Neoepitope detection of disease using protein arrays

ABSTRACT

A diagnostic device for and method of detecting the presence of head and neck squamous cell carcinoma (HNSCC) in a patient including a detector device for detecting a presence of at least one marker indicative of HNSCC, the detector device including a panel of markers for HNSCC. A diagnostic device for and method of staging HNSCC in a patient including a detector device for detecting a presence of at least one marker indicative of stages of HNSCC, the detector device including a panel of markers for HNSCC. Markers for head and neck squamous cell carcinoma selected from the markers listed in Table 5. Methods of personalized immunotherapy, making a personalized anti-cancer vaccine, and predicting a clinical outcome in a HNSCC patient. A method of making a panel of HNSCC markers.

CROSS-REFERENCE TO RELATED APPLICATIONS

This Continuation-In-Part Application claims priority to U.S. Continuation-In-Part patent application Ser. No. 11/060,867, filed Feb. 17, 2005, and U.S. Continuation-In-Part patent application Ser. No. 10/004,587, filed Dec. 4, 2001, both of which are incorporated herein by reference.

GRANT INFORMATION

Research in this application was supported in part by a grant from the National Institutes of Health (NIH Grant No. IR21CA100740-01) and a grant from MEDC (Grant No. MLSC 558). The Government has certain rights in the invention.

BACKGROUND OF THE INVENTION

1. Technical Field

The present invention relates to an assay and method for diagnosing disease. More specifically, the present invention relates to an immunoassay for use in diagnosing cancer.

2. Background Art

It is commonly known in the art that genetic mutations can be used for detecting cancer. For example, the tumorigenic process leading to colorectal carcinoma formation involves multiple genetic alterations (Fearon, et al. (1990) Cell 61, 759-767). Tumor suppressor genes such as p53, DCC and APC are frequently inactivated in colorectal carcinomas, typically by a combination of genetic deletion of one allele and point mutation of the second allele (Baker, et al. (1989) Science 244, 217-221; Fearon, et al. (1990) Science 247, 49-56; Nishisho, et al. (1991) Science 253, 665-669; and Groden, et al. (1991) Cell 66, 589-600). Mutation of two mismatch repair genes that regulate genetic stability was associated with a form of familial colon cancer (Fishel, et al. (1993) Cell 75, 1027-1038; Leach, et al. (1993) Cell 75, 1215-1225; Papadopoulos, et al. (1994) Science 263, 1625-1629; and Bronner, et al. (1994) Nature 368, 258-261). Proto-oncogenes such as myc and ras are altered in colorectal carcinomas, with c-myc RNA being overexpressed in as many as 65% of carcinomas (Erisman, et al. (1985) Mol. Cell. Biol. 5, 1969-1976), and ras activation by point mutation occurring in as many as 50% of carcinomas (Bos, et al. (1987) Nature 327, 293-297; and Forrester, et al. (1987) Nature 327, 298-303). Other proto-oncogenes, such as myb and neu, are activated with a much lower frequency (Alitalo, et al. (1984) Proc. Natl. Acad. Sci. USA 81, 4534-4538; and D'Emilia, et al. (1989) Oncogene 4, 1233-1239). No common series of genetic alterations is found in all colorectal tumors, suggesting that a variety of such combinations may be able to generate these tumors.

Increased tyrosine phosphorylation is a common element in signaling pathways that control cell proliferation. The deregulation of protein tyrosine kinases (PTKs) through overexpression or mutation has been recognized as an important step in cell transformation and tumorigenesis, and many oncogenes encode PTKs (Hunter (1989) in Oncogenes and the Molecular Origins of Cancer, ed. Weinberg (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.), pp. 147-173). Numerous studies have addressed the involvement of PTKs in human tumorigenesis. Activated PTKs associated with colorectal carcinoma include c-neu (amplification), trk (rearrangement), and c-src and c-yes (mechanism unknown) (D'Emilia, et al. (1989), ibid; Martin-Zanca, et al. (1986) Nature 3, 743-748; Bolen, et al. (1987) Proc. Natl. Acad. Sci. USA 84, 2251-2255; Cartwright, et al. (1989) J. Clin. Invest. 83, 2025-2033; Cartwright, et al. (1990) Proc. Natl. Acad. Sci. USA 87, 558-562; Talamonti, et al. (1993) J. Clin. Invest. 91, 53-60; and Park, et al. (1993) Oncogene 8, 2627-2635).

Mutations such as those disclosed above can be useful in detecting cancer. However, there have been few advancements that can be used reproducibly to diagnose cancer prior to the existence of a tumor. Approximately 40,500 new cases of head and neck squamous cell carcinoma (HNSCC) will be diagnosed in the United States, and 11,170 Americans will die from this disease, in the year 2006. Worldwide, HNSCC is the sixth most common malignancy, with an incidence of 644,000 new cases a year. Despite progress in diagnostic and treatment modalities in the past 30 years, long-term survival for patients affected by HNSCC has not significantly improved. One major impediment to improving survival in this patient population is the failure to detect this cancer at an early stage. More than two-thirds of patients with HNSCC are diagnosed at an advanced stage, when the five-year survival is less than 40%. In many cases, these patients are offered radical treatments that often result in significant physical disfigurement as well as dysfunction of speech, breathing, and swallowing. The plight of these patients with advanced stage disease is in distinct contrast to that of patients who are diagnosed early. Early stage HNSCC patients have an excellent five-year survival rate of more than 80% and experience significantly less impact on their quality of life after treatment with single modality therapy. This dramatic difference in survival and quality of life underscores the importance of early detection in this disease.

Early detection can be achieved by screening patients at high risk for development of cancer. Although the American Cancer Society has issued guidelines for screening of breast, colon, prostate, and uterine cancers, no such guideline exists for HNSCC. This is especially unfortunate given that patients at increased risk for development of HNSCC can be easily identified (history of excess alcohol and/or tobacco use) and targeted for screening. Early detection can also be improved by reducing diagnostic delays, reported to be between 3 and 5.6 months. Misdiagnosis at initial presentation to primary care physicians is common (44% to 63%) and may be due to the nonspecific nature of presenting symptoms (sore throat, hoarseness, ear pain, etc.) as well as the technical difficulty of examination in the head and neck region. The delay in diagnosis and referral to a specialist has a significant negative impact on patient outcome and survival. Thus, there exists a need for a simple, noninvasive, and inexpensive test, widely accessible to physicians in the primary care setting, which can be used to screen (in asymptomatic patients) and diagnose (in symptomatic patients) HNSCC in high-risk populations to improve early detection.

Methods for detecting and measuring cancer markers have been recently revolutionized by the development of immunological assays, particularly by assays that utilize monoclonal antibody technology. Previously, many cancer markers could only be detected or measured using conventional biochemical assay methods, which generally require large test samples and are therefore unsuitable in most clinical applications. In contrast, modern immunoassay techniques can detect and measure cancer markers in much smaller samples, particularly when monoclonal antibodies that specifically recognize a targeted marker protein are used. Accordingly, it is now routine to assay for the presence or absence, level, or activity of selected cancer markers by immunohistochemically staining tissue specimens obtained via conventional biopsy methods. Because of the highly sensitive nature of immunohistochemical staining, these methods have also been successfully employed to detect and measure cancer markers in smaller, needle biopsy specimens, which require less invasive sample gathering procedures compared to conventional biopsy specimens. In addition, other immunological methods have been developed and are now well known in the art that allow for detection and measurement of cancer markers in non-cellular samples such as serum and other biological fluids from patients. The use of these alternative sample sources substantially reduces the morbidity and costs of assays compared to procedures employing conventional biopsy samples, which allows for application of cancer marker assays in early screening and low risk monitoring programs where invasive biopsy procedures are not indicated.

For the purpose of cancer evaluation, the use of conventional or needle biopsy samples for cancer marker assays is often undesirable, because a primary goal of such assays is to detect the cancer before it progresses to a palpable or detectable tumor stage. Prior to this stage, biopsies are generally contraindicated, making early screening and low risk monitoring procedures employing such samples untenable. Therefore, there is a general need in the art to obtain samples for cancer marker assays by less invasive means than biopsy, for example, by serum withdrawal.

Efforts to utilize serum samples for cancer marker assays have met with limited success, largely because the targeted markers are either not detectable in serum, or because telltale changes in the levels or activity of the markers cannot be monitored in serum. In addition, the presence of cancer markers in serum probably occurs at the time of micro-metastasis, making serum assays less useful for detecting pre-metastatic disease. Serological analysis of recombinant cDNA expression libraries (SEREX) of tumors with autologous serum is a well established technique which has been used successfully to identify many relevant tumor antigens. However, many of the SEREX antigens identified from a specific cancer patient reacted only with autologous serum antibodies from that particular patient and tend to recognize antibodies only at a low frequency in sera from other cancer patients. Thus, clinical tests based on these SEREX antigens that are patient-specific rather than cancer-specific are insufficient for early detection of cancer.

In view of the above, an important need exists in the art for more widely applicable, non-invasive methods and materials to obtain biological samples for use in evaluating, diagnosing and managing breast and other diseases including cancer, particularly for screening early stage, nonpalpable tumors. A related need exists for methods and materials that utilize such readily obtained biological samples to evaluate, diagnose and manage disease, particularly by detecting or measuring selected cancer markers, or panels of cancer markers, to provide highly specific, cancer prognostic and/or treatment-related information, and to diagnose and manage pre-cancerous conditions, cancer susceptibility, bacterial and other infections, and other diseases.

Autoantibodies against cancer-specific antigens have been identified in cancers of the colon, breast, kidney, lung, ovary, and head and neck. An immune response with antibody production may be elicited due to the over-expression of cellular proteins such as Her2, the expression of mutated forms of cellular proteins such as mutated p53, or the aberrant expression of tissue-restricted gene products such as cancer-testis antigens by cancer cells. Because these autoantibodies are raised against specific antigens from the cancer cells, the detection of these antibodies in a patient's serum can be exploited as a diagnostic biomarker of cancer in that particular patient. Further, the immune system is especially well adapted for the early detection of cancer since it can respond to even low levels of an antigen by mounting a very specific and sensitive antibody response. Thus, the use of the immune response as a biosensor for early detection of cancer through a serum-based assay holds great potential as an ideal screening and diagnostic tool.

With specific regard to such assays, specific antibodies can only be measured by detecting binding to their antigen or a mimic thereof. Although certain classes of immunoglobulins containing the antibodies of interest can, in some cases, be separated from the sample prior to the assay (Decker, et al., EP 0,168,689 A2), in all assays, at least some portion of the sample immunoglobulins are contacted with antigen. For example, in assays for specific IgM, a portion of the total IgM can be adsorbed to a surface and the sample removed prior to detection of the specific IgM by contacting with antigen. Binding is then measured by detection of the bound antibody, detection of the bound antigen or detection of the free antigen.

For detection of bound antibody, a labeled anti-human immunoglobulin or labeled antigen is normally allowed to bind antibodies that have been specifically adsorbed from the sample onto a surface coated with the antigen, Bolz, et al., U.S. Pat. No. 4,020,151. Excess reagent is washed away and the label that remains bound to the surface is detected. This is the procedure in the most frequently used assays, for example, for hepatitis and human immunodeficiency virus and for numerous immunohistochemical tests, Nakamura, et al., Arch Pathol Lab Med 112:869-877 (1988). Although this method is relatively sensitive, it is subject to interference from non-specific binding to the surface by non-specific immunoglobulins that cannot be differentiated from the specific immunoglobulins.

Another method of detecting bound antibodies involves combining the sample and a competing labeled antibody, with a support-bound antigen, Schuurs, et al., U.S. Pat. No. 3,654,090. This method has its limitations because antibodies in sera bind numerous epitopes, making competition inefficient.

For detection of bound antigen, the antigen can be used in excess of the maximum amount of antibody that is present in the sample or in an amount that is less than the amount of antibody. For example, radioimmunoprecipitation (“RIP”) assays for GAD autoantibodies have been developed and are currently in use, Atkinson, et al., Lancet 335:1357-1360 (1990). However, attempts to convert this assay to an enzyme linked immunosorbent assay (“ELISA”) format have not been successful. The RIP assay is based on precipitation of immunoglobulins in human sera, and led to the development of a radioimmunoassay (“RIA”) for GAD autoantibodies. In both the RIP and the RIA, the antigen is added in excess and the bound antigen:antibody complex is precipitated with protein A-Sepharose. The complex is then washed or further separated by electrophoresis and the antigen in the complex is detected.

Other precipitating agents can be used such as rheumatoid factor or C1q, Masson, et al., U.S. Pat. No. 4,062,935; polyethylene glycol, Soeldner, et al., U.S. Pat. No. 4,855,242; and protein A, Ito, et al., EP 0,410,893 A2. The precipitated antigen can be measured to indicate the amount of antibody in the sample; the amount of antigen remaining in solution can be measured; or both the precipitated antigen and the soluble antigen can be measured to correct for any labeled antigen that is non-specifically precipitated. These methods, while quite sensitive, are all difficult to carry out because of the need for rigorous separation of the free antigen from the bound complex, which requires at a minimum filtration or centrifugation and multiple washing of the precipitate.

Alternatively, detection of the bound antigen can be employed when the amount of antigen is less than the maximum amount of antibody. Normally, that is carried out using particles such as latex particles or erythrocytes that are coated with the antigen, Cambiaso, et al., U.S. Pat. No. 4,184,849 and Uchida, et al., EP 0,070,527 A1. Antibodies can specifically agglutinate these particles and can then be detected by light scattering or other methods. It is necessary in these assays to use a precise amount of antigen as too little antigen provides an assay response that is biphasic and high antibody titers can be read as negative, while too much antigen adversely affects the sensitivity. It is therefore necessary to carry out sequential dilutions of the sample to assure that positive samples are not missed. Further, these assays tend to detect only antibodies with relatively high affinities and the sensitivity of the method is compromised by the tendency for all of the binding sites of each antibody to bind to the antigen on the particle to which it first binds, leaving no sites for binding to the other particle.

For assays in which the free antigen is detected, the antigen can also be added in excess or in a limited amount although only the former has been reported. Assays of this type have been described where an excess of antigen is added to the sample, the immunoglobulins are precipitated, and the antigen remaining in the solution is measured, Masson, et al., supra and Soeldner, et al., supra. These assays are relatively insensitive because only a small percentage change in the amount of free antigen occurs with low amounts of antibody, and this small percentage is difficult to measure accurately.
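
The insensitivity described above can be illustrated with simple arithmetic. The following is a minimal sketch only, assuming a one-to-one binding stoichiometry and complete binding of antibody to antigen; the function name and the numerical values are hypothetical and are not taken from the cited references.

```python
def free_antigen_change(total_antigen: float, antibody: float) -> float:
    """Return the fractional drop in free antigen after antibody binding.

    Illustrative assumption: each unit of antibody removes exactly one
    unit of antigen from solution, up to the total antigen present.
    """
    if total_antigen <= 0:
        raise ValueError("total_antigen must be positive")
    bound = min(antibody, total_antigen)
    return bound / total_antigen


# Hypothetical values: the same amount of antibody in both cases.
excess = free_antigen_change(total_antigen=100.0, antibody=1.0)
limited = free_antigen_change(total_antigen=2.0, antibody=1.0)
print(f"antigen in excess: {excess:.0%} change; limited antigen: {limited:.0%} change")
```

Under these assumptions, a low antibody level changes the free antigen signal by only about one percent when antigen is in large excess, whereas the same antibody level produces a much larger relative change when antigen is limiting.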

Practical assays in which the free antigen is detected and the antigen is not present in excess of the maximum amount of antibody expected in a sample have not been described. However, in van Erp, et al., Journal of Immunoassay 12(3):425-443 (1991), a fixed concentration of monoclonal antibody was incubated with a concentration dilution series of antigen, and free antigen was then measured using a gold sol particle agglutination immunoassay to determine antibody affinity constants.

There has been much research in the area of evaluating useful markers for determining the risk of patients developing insulin-dependent diabetes mellitus (IDDM). These include insulin autoantibodies, Soeldner, et al., supra and circulating autoantibodies to glutamic acid decarboxylase (“GAD”), Atkinson, et al., PCT/US89/05570 and Tobin, et al., PCT/US91/06872. In addition, Rabin, et al., U.S. Pat. No. 5,200,318 describes numerous assay formats for the detection of GAD and pancreatic islet cell antigen autoantibodies. GAD autoantibodies are of particular diagnostic importance because they occur in preclinical stages of the disease, which can make therapeutic intervention possible. However, the use of GAD autoantibodies as a diagnostic marker has been impeded by the lack of a convenient, nonisotopic assay.

One assay method involves incubating a support-bound antigen with the sample, then adding a labeled anti-human immunoglobulin. This is the basis for numerous commercially available assay kits for antibodies such as the Syn ELISA kit which assays for autoantibodies to GAD65, and is described in product literature entitled “Syn^(ELISA) GAD II-Antibodies” (Elias USA, Inc.). Substantial dilution of the sample is required because the method is subject to high background signals from adsorption of non-specific human immunoglobulins to the support.

Many of the assays described above involve detection of antibody that becomes bound to an immobilized antigen. This can have an adverse effect on the sensitivity of the assay due to difficulty in distinguishing between specific immunoglobulins and other immunoglobulins in the sample, which bind non-specifically to the immobilized antigen. There is not only a need to develop an assay that avoids non-specific detection of immunoglobulins, but there is also a need for an improved method of detecting antibodies that combines the sensitivity advantage of immunoprecipitation assays with a simplified protocol. Finally, assays that can help evaluate the risk of developing diseases are medically and economically very important. The present invention addresses these needs.

SUMMARY OF THE INVENTION

The present invention provides for a diagnostic device for use in detecting the presence of HNSCC in a patient including a detector device for detecting a presence of at least one marker indicative of HNSCC, the detector device including a panel of markers for HNSCC.

The present invention also provides for a diagnostic device for use in staging HNSCC in a patient including a detector device for detecting a presence of at least one marker indicative of stages of HNSCC, the detector device including a panel of markers for HNSCC.

Markers for HNSCC are provided selected from the markers listed in Table 5.

A method of diagnosing HNSCC is provided, including the steps of detecting markers in the serum of a patient indicative of the presence of HNSCC, and diagnosing the patient with HNSCC.

A method of staging HNSCC is provided, including the steps of detecting markers in the serum of a patient indicative of a stage of HNSCC, and determining the stage of HNSCC. A method of staging cancer is also similarly provided.

Further provided is a method of personalized immunotherapy, including the steps of detecting markers in the serum of a patient with the above diagnostic device, analyzing reactivity of markers in the serum to markers in the panel, identifying markers in the serum with the highest reactivity, and using the markers identified as immunotherapeutic agents personalized to the immunoprofile of the patient. A method of personalized targeted therapy is similarly provided.

A method of making a personalized anti-cancer vaccine is provided, including the steps of detecting markers in the serum of a patient with the above diagnostic device, analyzing reactivity of markers in the serum to markers in the panel, identifying markers in the serum with the highest reactivity, and formulating an anti-cancer vaccine using the identified markers.

Also provided is a method of predicting a clinical outcome in a HNSCC patient, including the steps of analyzing a pattern of reactivity of a patient's serum with a panel of HNSCC markers, and predicting a clinical outcome.

A method of making a panel of head and neck squamous cell carcinoma markers is provided, including the steps of creating HNSCC cDNA libraries, replicating the HNSCC cDNA libraries, performing differential biopanning, selecting clones to be arrayed on a protein microarray, immunoreacting the microarray against HNSCC patient serum, and selecting highly reactive clones for placement on a panel.

DESCRIPTION OF THE DRAWINGS

Other advantages of the present invention are readily appreciated as the same becomes better understood by reference to the following detailed description when considered in connection with the accompanying drawings wherein:

FIG. 1 shows the matrix of reactivity between sets of clones coming from patients 1-12 (in rows) and sera from the same patients (in columns). At this point (step 2 of Procedure 2), the matrix contains the results of the self-reactions: patients 1-10 have a specific self-reaction whereas patients 11 and 12 do not, and patients 11 and 12 are eliminated from the clone selection procedure;

FIG. 2 shows a matrix of reactivity between sources of clones and different sera ordered by reactivity; the clones from patient 2 react with sera from self (column 2) and patients 4 and 8; the clones from patient 3 react with sera from self (column 3) and patients 6 and 10, etc. Note that the union of the set of clones coming from patients 2, 3, 5, 7 and 1 ensures that the chip made with these clones reacts with all patients;

FIG. 3 is a schema showing the process of the present invention;

FIGS. 4A and 4B are photographs showing phage clones spotted in replication of six in an ordered array onto nitrocellulose coated glass slides;

FIGS. 5A and 5B are photographs showing protein microarrays immunoreacted against a serum sample from HNSCC patient (FIG. 5A) and serum from a control patient (FIG. 5B);

FIG. 6 is a schema showing the process of calculating the real accuracy of the process of the present invention; and

FIGS. 7A and 7B are graphs showing the link between the immunoreaction level of the different clones and the class membership of the samples (cancer versus healthy).
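
The clone-source selection summarized for FIGS. 1 and 2 can be viewed as a covering problem: choose sets of clones whose combined reactivity reaches every patient serum. The following is a minimal sketch of one possible way to do this, using a greedy covering heuristic that is an assumption for illustration and is not stated to be the procedure of the invention. The function name is hypothetical. The reactivities for clones from patients 2 and 3 follow the description of FIG. 2; all other entries in the example matrix are hypothetical placeholders.

```python
from typing import Dict, List, Set


def select_clone_sources(reactivity: Dict[int, Set[int]]) -> List[int]:
    """Greedily pick clone sources until every reachable serum is covered.

    reactivity maps a source patient (whose clones are used) to the set of
    patients whose sera react with those clones.
    """
    uncovered: Set[int] = set().union(*reactivity.values())
    chosen: List[int] = []
    while uncovered:
        # Pick the source covering the most uncovered sera;
        # ties go to the lowest patient number.
        best = max(sorted(reactivity), key=lambda src: len(reactivity[src] & uncovered))
        chosen.append(best)
        uncovered -= reactivity[best]
    return chosen


# Patients 11 and 12 are omitted because they lack a specific self-reaction.
reactivity_matrix = {
    1: {1},            # self-reaction only (hypothetical)
    2: {2, 4, 8},      # as described for FIG. 2
    3: {3, 6, 10},     # as described for FIG. 2
    4: {4},            # hypothetical
    5: {5, 9},         # hypothetical
    6: {6},            # hypothetical
    7: {7},            # hypothetical
    8: {8},            # hypothetical
    9: {9},            # hypothetical
    10: {10},          # hypothetical
}
print(sorted(select_clone_sources(reactivity_matrix)))
```

With this illustrative matrix, the selected sources are patients 1, 2, 3, 5 and 7, the same union noted in the description of FIG. 2. A greedy heuristic does not guarantee the smallest possible set of sources, so it should be read as a sketch rather than an optimal selection rule.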

DETAILED DESCRIPTION OF THE INVENTION

Generally, the present invention provides a method and markers for use in detecting disease and stages of disease. In other words, the markers can be used to determine the presence of disease without requiring the presence of symptoms. Particularly, the present invention provides a method and markers for use in detecting HNSCC and stages of HNSCC, as well as various other diagnostic methods, which are further described below.

The present invention can further be understood in light of the following terms and definitions.

By “bodily fluid” as used herein it is meant any bodily fluid known to those of skill in the art to contain antibodies therein. Examples include, but are not limited to, blood, saliva, tears, spinal fluid, serum, and other fluids known to those of skill in the art to contain antibodies.

By “biopanning”, it is meant a selection process for use in screening a library (Parmley and Smith, Gene, 73:308 (1988); Noren, C. J., NEB Transcript, 8(1); 1 (1996)). Biopanning is carried out by incubating phages encoding the peptides with a plate coated with the proteins, washing away the unbound phage, eluting, and amplifying the specifically bound phage. Those skilled in the art readily recognize other immobilization schemes that can provide equivalent technology, such as but not limited to binding the proteins or other targets to beads.

By “staging” the disease, as for example in cancer, it is intended to include determining the extent of a cancer, especially whether the disease has spread from the original site to other parts of the body. The stages can range from 0 to 5 with 0 being the presence of cancerous cells and 5 being the spread of the cancer cells to other parts of the body including the lymph nodes. Further, the staging can indicate the stage of a borderline histology. A borderline histology is a less malignant form of disease. Additionally, staging can indicate a relapse of disease, in other words the reoccurrence of disease.

The term “marker” as used herein is intended to include, but is not limited to, a gene or a piece of a gene which codes for a protein, a protein such as a fusion protein, open reading frames such as ESTs, epitopes, mimotopes, antigens, and any other indicator of immune response. Each of these terms is used interchangeably to refer to a marker. The marker can also be used as a predictor of disease or the recurrence of disease.

“Mimotope” refers to a random peptide epitope that mimics a natural antigenic epitope during epitope presentation, which is further included in the invention. Such mimotopes are useful in the applications and methods discussed below. Also included in the present invention is a method of identifying a random peptide epitope. In the method, a library of random peptide epitopes is generated or selected. The library is contacted with an anti-antibody. Mimotopes are identified that are specifically immunoreactive with the antibody. Sera (containing anti antibodies) or antibodies generated by the methods of the present invention can be used. Random peptide libraries can, for example, be displayed on phage (phagotopes) or generated as combinatorial libraries.

“Antibody” refers to a polypeptide comprising a framework region from an immunoglobulin gene or fragments thereof that specifically binds and recognizes an antigen. The recognized immunoglobulin genes include the kappa, lambda, alpha, gamma, delta, epsilon, and mu constant region genes, as well as the various immunoglobulin diversity/joining/variable region genes. Light chains are classified as either kappa or lambda. Heavy chains are classified as gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin classes, IgG, IgM, IgA, IgD and IgE, respectively.

An exemplary immunoglobulin (antibody) structural unit comprises a tetramer. Each tetramer is composed of two identical pairs of polypeptide chains, each pair having one “light” (about 25 kDa) and one “heavy” chain (about 50-70 kDa). The N-terminus of each chain defines a variable region of about 100 to 110 or more amino acids primarily responsible for antigen recognition. The terms variable light chain (V_(L)) and variable heavy chain (V_(H)) refer to these light and heavy chains respectively.

Antibodies exist, e.g., as intact immunoglobulins or as a number of well-characterized fragments produced by digestion with various peptidases. Thus, for example, pepsin digests an antibody below the disulfide linkages in the hinge region to produce F(ab)′₂, a dimer of Fab which itself is a light chain joined to V_(H)-C_(H)1 by a disulfide bond. The F(ab)′₂ can be reduced under mild conditions to break the disulfide linkage in the hinge region, thereby converting the F(ab)′₂ dimer into an Fab′ monomer. The Fab′ monomer is essentially Fab with part of the hinge region (see Fundamental Immunology (Paul ed., 3d ed. 1993)). While various antibody fragments are defined in terms of the digestion of an intact antibody, one of skill can appreciate that such fragments can be synthesized de novo either chemically or by using recombinant DNA methodology. Thus, the term antibody, as used herein, also includes antibody fragments either produced by the modification of whole antibodies, or those synthesized de novo using recombinant DNA methodologies (e.g., single chain Fv) or those identified using phage display libraries (see, e.g., McCafferty, et al., Nature 348:552-554 (1990)).

For preparation of monoclonal or polyclonal antibodies, any technique known in the art can be used (see, e.g., Kohler & Milstein, Nature 256:495-497 (1975); Kozbor, et al., Immunology Today 4: 72 (1983); Cole, et al., pp. 77-96 in Monoclonal Antibodies and Cancer Therapy (1985)). Techniques for the production of single chain antibodies (U.S. Pat. No. 4,946,778) can be adapted to produce antibodies to polypeptides of this invention. Also, transgenic mice, or other organisms such as other mammals, can be used to express humanized antibodies. Alternatively, phage display technology can be used to identify antibodies and heteromeric Fab fragments that specifically bind to selected antigens (see, e.g., McCafferty, et al., Nature 348:552-554 (1990); Marks, et al., Biotechnology 10:779-783 (1992)).

A “chimeric antibody” is an antibody molecule in which (a) the constant region, or a portion thereof, is altered, replaced or exchanged so that the antigen binding site (variable region) is linked to a constant region of a different or altered class, effector function and/or species, or an entirely different molecule which confers new properties to the chimeric antibody, e.g., an enzyme, toxin, hormone, growth factor, drug, etc.; or (b) the variable region, or a portion thereof, is altered, replaced or exchanged with a variable region having a different or altered antigen specificity.

The term “immunoassay” is an assay wherein an antibody specifically binds to an antigen. The immunoassay is characterized by the use of specific binding properties of a particular antibody to isolate, target, and/or quantify the antigen. In addition, an antigen can be used to capture or specifically bind an antibody.

The phrase “specifically (or selectively) binds” to an antibody or “specifically (or selectively) immunoreactive with,” when referring to a protein or peptide, refers to a binding reaction that is determinative of the presence of the protein in a heterogeneous population of proteins and other biologics. Thus, under designated immunoassay conditions, the specified antibodies bind to a particular protein at least two times the background and do not substantially bind in a significant amount to other proteins present in the sample. Specific binding to an antibody under such conditions can require an antibody that is selected for its specificity for a particular protein. For example, polyclonal antibodies raised to modified β-tubulin from specific species such as rat, mouse, or human can be selected to obtain only those polyclonal antibodies that are specifically immunoreactive, e.g., with β-tubulin modified at cysteine 239 and not with other proteins. This selection can be achieved by subtracting out antibodies that cross-react with other molecules. Monoclonal antibodies raised against modified β-tubulin can also be used. A variety of immunoassay formats can be used to select antibodies specifically immunoreactive with a particular protein. For example, solid-phase ELISA immunoassays are routinely used to select antibodies specifically immunoreactive with a protein (see, e.g., Harlow & Lane, Antibodies, A Laboratory Manual (1988), for a description of immunoassay formats and conditions that can be used to determine specific immunoreactivity). Typically a specific or selective reaction can be at least twice background signal or noise and more typically more than 10 to 100 times background.
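
The fold-over-background criterion in the preceding definition can be expressed as a simple check. The following is a minimal sketch, assuming that signal and background are measured in the same arbitrary units; the function name, parameter names, and example values are hypothetical.

```python
def is_specific_binding(signal: float, background: float, min_fold: float = 2.0) -> bool:
    """Return True when signal is at least min_fold times the background.

    The default of 2.0 reflects the "at least two times the background"
    criterion; a stricter value such as 10.0 can be passed for the more
    typical 10- to 100-fold case.
    """
    if background <= 0:
        raise ValueError("background must be positive")
    return signal / background >= min_fold


# Hypothetical readings in arbitrary intensity units.
print(is_specific_binding(signal=250.0, background=100.0))                  # 2.5-fold
print(is_specific_binding(signal=250.0, background=100.0, min_fold=10.0))   # stricter cutoff
```

Keeping the cutoff as a parameter leaves the choice between the minimal two-fold criterion and a stricter threshold to the designated immunoassay conditions.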

A “label” or a “detectable moiety” is a composition detectable by spectroscopic, photochemical, biochemical, immunochemical, or chemical means. For example, useful labels include ³²P, fluorescent dyes, iodine, electron-dense reagents, enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and proteins for which antisera or monoclonal antibodies are available, e.g., by incorporating a radiolabel into the peptide, or any other label known to those of skill in the art.

A “labeled antibody or probe” is one that is bound, either covalently, through a linker or a chemical bond, or noncovalently, through ionic, van der Waals, electrostatic, or hydrogen bonds to a label such that the presence of the antibody or probe can be detected by detecting the presence of the label bound to the antibody or probe.

The terms “isolated,” “purified,” or “biologically pure” refer to material that is substantially or essentially free from components that normally accompany it as found in its native state. Purity and homogeneity are typically determined using analytical chemistry techniques such as polyacrylamide gel electrophoresis or high performance liquid chromatography. A protein that is the predominant species present in a preparation is substantially purified. The term “purified” denotes that a nucleic acid or protein gives rise to essentially one band in an electrophoretic gel. Particularly, it means that the nucleic acid or protein is at least 85% pure, optionally at least 95% pure, and optionally at least 99% pure.

The term “recombinant” when used with reference, e.g., to a cell, or nucleic acid, protein, or vector, indicates that the cell, nucleic acid, protein or vector, has been modified by the introduction of a heterologous nucleic acid or protein or the alteration of a native nucleic acid or protein, or that the cell is derived from a cell so modified. Thus, for example, recombinant cells express genes that are not found within the native (non-recombinant) form of the cell or express native genes that are otherwise abnormally expressed, under expressed or not expressed at all.

An “expression vector” is a nucleic acid construct, generated recombinantly or synthetically, with a series of specified nucleic acid elements that permit transcription of a particular nucleic acid in a host cell. The expression vector can be part of a plasmid, virus, or nucleic acid fragment. Typically, the expression vector includes a nucleic acid to be transcribed operably linked to a promoter.

The term “support or surface” as used herein is intended to include, but is not limited to, a solid phase, typically a porous or non-porous water-insoluble material that can have any one of a number of shapes, such as a strip, rod, or particle, including beads and the like. Suitable materials are well known in the art and are described in, for example, U.S. Pat. No. 5,185,243 to Ullman, et al., columns 10-11, U.S. Pat. No. 4,868,104 to Kurn, et al., column 6, lines 21-42 and U.S. Pat. No. 4,959,303 to Milburn, et al., column 6, lines 14-31, which are incorporated herein by reference. Binding of ligands and receptors to the support or surface can be accomplished by well-known techniques, readily available in the literature. See, for example, “Immobilized Enzymes,” Ichiro Chibata, Halsted Press, New York (1978) and Cuatrecasas, J. Biol. Chem. 245:3059 (1970). Whatever type of solid support is used, it must be treated so as to have bound to its surface either a receptor or ligand that directly or indirectly binds the antigen. Typical receptors include antibodies, intrinsic factor, specifically reactive chemical agents such as sulfhydryl groups that can react with a group on the antigen, and the like. For example, avidin or streptavidin can be covalently bound to spherical glass beads of 0.5-1.5 mm and used to capture a biotinylated antigen.

Signal producing system (“sps”) includes one or more components, at least one component being a label, which generates a detectable signal that relates to the amount of bound and/or unbound label, i.e. the amount of label bound or not bound to the compound being detected. The label is any molecule that produces or can be induced to produce a signal, such as a fluorescer, enzyme, chemiluminescer, or photosensitizer. Thus, the signal is detected and/or measured by detecting enzyme activity, luminescence, or light absorbance.

Suitable labels include, by way of illustration and not limitation, enzymes such as alkaline phosphatase, glucose-6-phosphate dehydrogenase (“G6PDH”) and horseradish peroxidase; ribozyme; a substrate for a replicase such as Q-beta replicase; promoters; dyes; fluorescers such as fluorescein, isothiocyanate, rhodamine compounds, phycoerythrin, phycocyanin, allophycocyanin, o-phthaldehyde, and fluorescamine; chemiluminescers such as isoluminol; sensitizers; coenzymes; enzyme substrates; photosensitizers; particles such as latex or carbon particles; suspendable particles; metal sol; crystallite; liposomes; cells, etc., which can be further labeled with a dye, catalyst, or other detectable group. Suitable enzymes and coenzymes are disclosed in U.S. Pat. No. 4,275,149 to Litman, et al., columns 19-28, and U.S. Pat. No. 4,318,980 to Boguslaski, et al., columns 10-14; suitable fluorescers and chemiluminescers are disclosed in U.S. Pat. No. 4,275,149 to Litman, et al., at columns 30 and 31; which are incorporated herein by reference. Preferably, at least one sps member is selected from the group consisting of fluorescers, enzymes, chemiluminescers, photosensitizers, and suspendable particles.

The label can directly produce a signal, and therefore, additional components are not required to produce a signal. Numerous organic molecules, for example fluorescers, are able to absorb ultraviolet and visible light, where the light absorption transfers energy to these molecules and elevates them to an excited energy state. This absorbed energy is then dissipated by emission of light at a second wavelength. Other labels that directly produce a signal include radioactive isotopes and dyes.

Alternately, the label may need other components to produce a signal, and the sps can then include all the components required to produce a measurable signal, which can include substrates, coenzymes, enhancers, additional enzymes, substances that react with enzymatic products, catalysts, activators, cofactors, inhibitors, scavengers, metal ions, specific binding substance required for binding of signal generating substances, and the like. A detailed discussion of suitable signal producing systems can be found in U.S. Pat. No. 5,185,243 to Ullman, et al., columns 11-13, which is incorporated herein by reference.

The label is bound to a specific binding pair (hereinafter “sbp”) member which is the antigen, or is capable of directly or indirectly binding the antigen, or is a receptor for the antigen, and includes, without limitation, the antigen; a ligand for a receptor bound to the antigen; a receptor for a ligand bound to the antigen; an antibody that binds the antigen; a receptor for an antibody that binds the antigen; a receptor for a molecule conjugated to an antibody to the antigen; an antigen surrogate capable of binding a receptor for the antigen; a ligand that binds the antigen, etc. Binding of the label to the sbp member can be accomplished by means of non-covalent bonding, for example, by formation of a complex of the label with an antibody to the label, or by means of covalent bonding, for example, by chemical reactions which result in replacing a hydrogen atom of the label with a bond to the sbp member or can include a linking group between the label and the sbp member. Such methods of conjugation are well known in the art. See for example, U.S. Pat. No. 3,817,837 to Rubenstein, et al., which is incorporated herein by reference. Other sps members can also be bound covalently to sbp members. For example, in U.S. Pat. No. 3,996,345 to Ullman, et al., two sps members such as a fluorescer and quencher can be bound respectively to two sbp members that both bind the analyte, thus forming a fluorescer-sbp₁:analyte:sbp₂-quencher complex. Formation of the complex brings the fluorescer and quencher in close proximity, thus permitting the quencher to interact with the fluorescer to produce a signal. This is a fluorescent excitation transfer immunoassay. Another concept is described in EP 0,515,194 A2 to Ullman, et al., which uses a chemiluminescent compound and a photosensitizer as the sps members. This is referred to as a luminescent oxygen channeling immunoassay. Both the aforementioned references are incorporated herein by reference.

The method and markers of the present invention can be used to diagnose the presence of a disease or a disease stage in a patient, as well as in other diagnostic methods as mentioned above. Specifically, a diagnostic device for use in detecting the presence of HNSCC in a patient is provided and includes a detector device for detecting the presence of at least one marker in the serum of the patient.

The detector includes, but is not limited to, an assay, a slide, a filter, a microarray, a macroarray, computer software implementing data analysis methods, and any combinations thereof. For example, the detector can be an immunoassay such as ELISA. The detector can also include a two-color detection system or other detector system known to those of skill in the art.

The detector also includes a panel of markers that are indicative of the presence of disease. The panel of markers can include markers that are known to those of skill in the art and markers determined utilizing the methodology disclosed herein. Examples of diseases that the markers detect include, but are not limited to, gynecological sickness such as endometriosis, ovarian cancer, breast cancer, cervical cancer, HNSCC, and primary peritoneal carcinoma. The method can also be used to identify overexpressed or mutated proteins in tumor cells. That such proteins are mutated or overexpressed presumably is the basis for the immune reaction to these proteins. Therefore, markers identified using these methods could provide markers for molecular pathology as diagnostic or prognostic markers. In the present invention, the markers are preferably used to detect HNSCC. Preferably, the markers are selected from those listed in Table 5.

A method of making a panel of head and neck squamous cell carcinoma markers is provided, including the steps of creating HNSCC cDNA libraries, replicating the HNSCC cDNA libraries, performing differential biopanning, selecting clones to be arrayed on a protein microarray, immunoreacting the microarray against HNSCC patient serum, and selecting highly reactive clones for placement on a panel. The method of making the detector and panel are further described in the examples below as well.

This diagnostic device can be used to diagnose HNSCC by detecting markers in the serum of a patient indicative of the presence of HNSCC, and diagnosing the patient with HNSCC. As further described below, the serum of the patient is compared to the panel of markers in the detector device, and based on the reactivity of the serum with the panel, a diagnosis is made.

The present invention also provides for a diagnostic device for use in staging HNSCC in a patient including a detector device for detecting a presence of at least one marker indicative of stages of HNSCC, wherein the detection device includes a panel of markers for HNSCC. The detector is essentially the same as described above; however, the detector is created to determine the stage of HNSCC. Relevant markers are selected according to each stage of HNSCC desired in order to make the panel of HNSCC markers. This panel can then be used in the diagnostic device to test a patient's serum in order to determine the stage of their HNSCC. Knowing the stage of HNSCC can aid in selecting appropriate treatments.

The present invention also provides for a more general method of staging cancer, by detecting RNA or protein levels of markers that are overexpressed or altered due to mutation in the serum of a patient indicative of a stage of cancer with the diagnostic device, and determining the stage of the cancer.

A method of personalized immunotherapy is provided, including the steps of detecting markers in the serum of a patient with the diagnostic device as described above, analyzing reactivity of markers in the serum to markers in the panel of the diagnostic device, identifying markers in the serum with the highest reactivity, and using the markers identified as immunotherapeutic agents personalized to the immunoprofile of the patient. Thus, the immunotherapy is targeted to a person's immunoprofile based on the panel of markers.

According to the present invention, the immunotherapeutic agents are preferably immunotherapeutic agents for HNSCC; however, the immunotherapeutic agents can also be directed to other diseases such as, but not limited to, those mentioned above. For personalized immunotherapy, the reactivity to particular epitope clones can be correlated using sera from patients having cancer. Using a comprehensive panel of epitope markers that can accurately detect early stage HNSCC, one can utilize these antigens as immunotherapeutic agents personalized to the immunoprofile of each patient. When T-cells from the patient recognize antigen biomarkers, the T-cells are stimulated and activated, and therefore produce an immune response. Such reactivity demonstrates the potential of each antigen as a component of a vaccine to induce a T cell-mediated immune response essential for generation of cancer vaccines. Individuals scoring positive in the presymptomatic testing for HNSCC are then offered an anti-cancer vaccine tailored to their immunoprofile against a panel of tumor antigens.

Like the method of personalized immunotherapy, the present invention also provides for a method of personalized targeted therapy by detecting markers that are overexpressed or altered due to mutation in the serum of a patient with the diagnostic device as described above, analyzing reactivity of markers in the serum to markers in the panel of the diagnostic device, identifying markers in the serum with the highest reactivity, and using the markers identified as therapeutic targets personalized to the patient. In other words, this method is directed to different types of therapy and not just immunotherapy.

Treatment of a patient can be altered based upon the markers detected. For example, the treatment can be specifically designed based upon the markers identified. In other words, the therapy can be altered to most suitably treat the identified markers, such that the treatment is designed to most efficiently treat the identified marker. Thus, the treatment is personalized according to the patient's needs. The ability to adjust the therapy enables the treatment to be tailored to the needs of the person being treated. The treatments that can be used range from vaccines to chemotherapy.

The present invention therefore also provides for a method of making a personalized anti-cancer vaccine, including the steps of detecting markers in the serum of a patient with the diagnostic device as described above, analyzing reactivity of markers in the serum to markers in the panel of the diagnostic device, identifying markers in the serum with the highest reactivity, and formulating an anti-cancer vaccine using the identified markers. Preferably, the anti-cancer vaccine is an anti-HNSCC vaccine.

As further described in the examples below, a method is also provided of predicting a clinical outcome in a HNSCC patient, including the steps of analyzing a pattern of reactivity of a patient's serum with a panel of HNSCC markers, and predicting a clinical outcome. The detector device including the panel as described above can be used to analyze the patient's serum. Further, the clinical outcome predicted includes, but is not limited to, a response to a particular therapeutic intervention or chemotherapeutic drug, survival, and development of neck or distant metastasis.

Explanations as well as examples are provided below detailing the methods and devices of the present invention.

The analysis of mRNA expression in tumors does not necessarily reveal the status of protein levels in the cancer cells. Other factors such as protein half-life and mutation can be altered without an effect on mRNA levels thus masking significant molecular changes at the protein level. Serum antibody reactivity to cellular proteins occurs in cancer patients due to presentation of mutated forms of proteins from the tumor cells or overexpression of proteins in the tumor cells. The host immune system can direct individuals to molecular events critical to the genesis of the disease. Using a candidate gene approach, experience has shown that the frequency of serum positivity to any single protein is low. Therefore, to increase the identification of such autoantigens, a more global approach is employed to exploit immunoreactivity to identify large numbers of cDNAs coding for proteins that are mutated or upregulated in cancer cells.

In order to develop an effective screening test for early detection of HNSCC cancer, cDNA phage display libraries are used to isolate cDNAs coding for epitopes reacting with antibodies present specifically in the sera of patients with HNSCC cancer. The methods of the present invention detect various antibodies that are produced by patients in reaction to proteins overexpressed in their HNSCC tumors. This is achievable by differential biopanning technology using human sera collected both from normal individuals and patients having HNSCC cancer and phage display libraries expressing cDNAs of genes expressed in HNSCC tumors and cell lines. Serum reactivity toward a cellular protein can occur because of the presentation to the immune system of a mutated form of the protein from the tumor cells or overexpression of the protein in the tumor cells. The strategy provides for the identification of epitope-bearing phage clones (phagotopes) displaying reactivity with antibodies present in sera of patients having HNSCC cancer but not in control sera from unaffected individuals. This strategy leads to the identification of novel disease-related epitopes for diseases including, but not limited to, HNSCC cancer, that have prognostic/diagnostic value with additional potential for therapeutic vaccines and medical imaging reagents. This also creates a database that can be used to determine both the presence of disease and the stage of the disease.

The series of experiments disclosed herein provide direct evidence that biopanning a T7 coat protein fusion library can isolate epitopes for antibodies present in polyclonal sera. This also showed that the technology can be applied to direct microarray screening of large numbers of selected phage against numerous patient and control sera. This approach provides a large number of biomarkers for early detection of disease.

More specifically, the methods of the present invention provide four to five cycles of affinity selection and biopanning which are carried out with biological amplification of the phage after each biopanning, meaning growth of the biological vector of the cDNA expression clone in a biological host. Examples of biological amplification include, but are not limited to, growth of a lytic or lysogenic bacteriophage in host bacteria or transformation of a bacterial host with selected DNA of the cDNA expression vector. The number of biopanning cycles generally determines the extent of the enrichment for phage that bind to the sera of patients with HNSCC cancer. This strategy allows for one cycle of biopanning to be performed in a single day. Someone skilled in the art can establish different schedules of biopanning that provide the same essential features of the procedure described above.

Two biopanning experiments are performed with each library, differentially selecting clones between control and disease patient sera. The first selection is to isolate phagotope clones that do not bind to control sera pooled from control individuals but do bind to a pool of disease patient serum. This set of phagotope clones represents epitopes that are indicative of the presence of disease as recognized by the host immune system. The second type of screening is performed to isolate phagotope clones that do not bind to a pool of control sera but do bind to an individual patient's serum. Those sets of phagotope clones represent epitopes that are indicative of the presence of disease.

Subsequent to the biopanning, the clones so isolated can be used to contact antibodies in sera by spotting the clones or peptide sequences of amino acids containing those encoded by the clones. After spotting on a solid support, the arrays are rinsed briefly in 1% BSA/PBS to remove unbound phage, then transferred immediately to a 1% BSA/PBS blocking solution and allowed to sit for one hour at room temperature. The excess BSA is rinsed off from the slides using PBS. This step ensures that the elution step of antibodies is more effective. The use of PBS elutes all of the antibodies without harming the binding of the antibody. Antibody detection of reaction with the clones or peptides on the array is carried out by labeling of the serum antibodies or through the use of a labeled secondary antibody that reacts with the patient's antibodies. A second control reaction to every spot allows for greater accuracy of the quantitation of reactivity and increases sensitivity of detection.

The slides are subsequently processed to quantify the reaction of each phagotope. Such processing is specific to the label used. For instance, if Cy3 and Cy5 fluorophore labels are used, this processing is done in a laser scanner that captures an image of the slide for each fluorophore used. Subsequent image processing familiar to those skilled in the art can provide intensity values for each phagotope. The data analysis can be divided into the following steps:

1. Pre-processing and normalization.
2. Identifying the most informative markers.
3. Building a predictor for molecular diagnosis of HNSCC cancer and validating the results.

The purpose of the first step is to cleanse the data from artifacts and prepare it for the subsequent steps. Such artifacts are usually introduced in the laboratory and include: slide contamination, differential dye incorporation, scanning and image processing problems (e.g. different average intensities from one slide to another), imperfect spots due to imperfect arraying, washing, drying, etc. The purpose of the second step is to select the most informative phages that can be used for diagnostic purposes. The purpose of the third step is to use a software classifier able to diagnose cancer based on the antibody reactivity values of the selected phages. The last step also includes the validation of this classifier and the assessment of its performance using various measures such as specificity, sensitivity, positive predictive value and negative predictive value. The computation of such measures can be done on cases not used during the design of the chip in order to assess the real-world performance of the diagnosis tool obtained.

The pre-processing and normalization step is performed as follows for arrays using two channels, such as Cy5 for the human IgG and Cy3 for the T7 control. The spots are segmented, and the mean intensity is calculated for each spot. A mean intensity value is calculated for the background, as well. A background corrected value is calculated by subtracting the background from the signal. If necessary, non-linear dye effects can be eliminated by performing an exponential normalization (Houts, 2000) and/or LOESS normalization of the data and/or a piecewise linear normalization. The values obtained from each channel are subsequently divided by the mean of the intensities over the whole array. Subsequently, the ratio between the IgG and the T7 channels is calculated. The values coming from replicate spots (spots printed in quadruplicates) are combined by calculating mean and standard deviation. Outliers (outside ± two standard deviations) are flagged for manual inspection. Single channel arrays are pre-processed in a similar way but without taking the ratios. This preprocessing sequence was shown to provide good results for all preliminary data analyzed.
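The two-channel steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the software used in the examples: the function names and the intensity values are hypothetical, and the optional non-linear dye corrections (exponential, LOESS, piecewise linear) are omitted.

```python
from statistics import mean


def normalize_channel(signal, background):
    """Background-correct each spot, then divide by the array-wide mean.

    `signal` and `background` are per-spot mean intensities for one channel.
    """
    corrected = [s - b for s, b in zip(signal, background)]
    channel_mean = mean(corrected)
    return [c / channel_mean for c in corrected]


def igg_t7_ratios(igg_signal, igg_bg, t7_signal, t7_bg):
    """Per-spot ratio of normalized IgG (Cy5) to normalized T7 control (Cy3).

    Assumes every background-corrected T7 value is non-zero; a real
    pipeline would have to handle empty or failed control spots.
    """
    igg = normalize_channel(igg_signal, igg_bg)
    t7 = normalize_channel(t7_signal, t7_bg)
    return [i / t for i, t in zip(igg, t7)]


# Hypothetical intensities for four spots (illustrative only).
igg_signal = [1200.0, 300.0, 2500.0, 400.0]
igg_bg = [200.0, 100.0, 500.0, 200.0]
t7_signal = [1100.0, 900.0, 1300.0, 1100.0]
t7_bg = [100.0, 100.0, 100.0, 100.0]

ratios = igg_t7_ratios(igg_signal, igg_bg, t7_signal, t7_bg)
```

Dividing each channel by its own array-wide mean before taking the ratio removes slide-to-slide differences in overall brightness, so ratios from different slides are on a comparable scale.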

The step of selecting the most informative markers is used to identify the most informative phages from the large initial set of phages. The better the selection, the better the expected accuracy of the diagnosis tool.

A first test is necessary to determine whether a specific epitope is suitable for inclusion in the final set to be spotted. The selection methods to be applied follow the principles of the methods successfully applied in (Golub et al., 1999; Alizadeh et al., 2000) and are briefly described below.

Procedure 1

The procedure is initiated by defining a template for the cancer case. Unlike gene expression experiments where the expression level of a gene can be either up or down in cancer vs. healthy subjects, here the presence of antibodies specific to cancer is tested for. Therefore, epitopes with high reactivity in controls and low reactivity in patients are not expected, and a single template profile is sufficient. Each epitope can have a profile across the given set of patients. The profile of each epitope is compared with the template using a correlation-based distance. Those skilled in the art can recognize that other distances can be used without essentially changing the procedure.
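The template comparison can be illustrated as follows. The sketch assumes a template equal to 1 for every patient and 0 for every control, and uses the Pearson correlation as the similarity (so the distance is one minus the correlation); the clone names and reactivity values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def rank_epitopes(profiles, is_patient):
    """Order epitopes by similarity to the cancer template.

    `profiles` maps an epitope name to its reactivity across subjects;
    `is_patient` marks which subjects are patients (assumed template: 1/0).
    """
    template = [1.0 if p else 0.0 for p in is_patient]
    scored = [(name, pearson(values, template)) for name, values in profiles.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical data: three patients followed by three controls.
is_patient = [True, True, True, False, False, False]
profiles = {
    "clone_A": [2.1, 1.9, 2.3, 0.2, 0.1, 0.3],  # reacts with patients only
    "clone_B": [0.5, 0.4, 0.6, 0.5, 0.6, 0.4],  # no patient/control difference
}
ranking = rank_epitopes(profiles, is_patient)
```

An epitope whose score exceeds the chosen correlation threshold would be retained as informative; any other distance could be substituted for `pearson` without changing the rest of the ranking.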

The epitopes are then prioritized based on the similarity between the reference profile and their actual profile. For example, as detailed in U.S. Ser. No. 11/060,867 (incorporated herein by reference), 46 epitopes were found to be informative for a correlation threshold of 0.8. The final cutoff threshold is calculated by doing 1000 random permutations once the whole data set becomes available. Each such permutation randomly moves the subjects between the ‘patient’ and ‘control’ categories. Calculating the score of each epitope profile for such permutations allows for the establishment of a suitable threshold for the similarity (Golub et al., 1999).
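One way to turn the permutations into a cutoff is sketched below. The text does not state how the threshold is derived from the permuted scores, so the rule used here (the 95th percentile of the best score seen in each permutation) is an assumption, as are the function names, the seed, and the data.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def permutation_threshold(profiles, is_patient, n_perm=1000, quantile=0.95, seed=0):
    """Estimate a similarity cutoff from label permutations.

    Each permutation shuffles the patient/control labels, scores every
    epitope profile against the shuffled template, and records the best
    score. The cutoff is the chosen quantile of those null scores
    (an assumed rule, not one given in the text).
    """
    rng = random.Random(seed)
    labels = [1.0 if p else 0.0 for p in is_patient]
    null_scores = []
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        null_scores.append(max(pearson(values, shuffled) for values in profiles))
    null_scores.sort()
    index = min(len(null_scores) - 1, int(quantile * len(null_scores)))
    return null_scores[index]


# Hypothetical reactivity profiles: four patients followed by four controls.
is_patient = [True, True, True, True, False, False, False, False]
profiles = [
    [2.1, 1.9, 2.3, 2.0, 0.2, 0.1, 0.3, 0.2],
    [0.5, 0.4, 0.6, 0.5, 0.5, 0.6, 0.4, 0.5],
    [1.0, 0.2, 0.9, 0.3, 0.8, 0.1, 0.7, 0.4],
]
threshold = permutation_threshold(profiles, is_patient)
```

Fixing the seed makes the cutoff reproducible; with few subjects the number of distinct labelings is small, so the estimate is only meaningful once the whole data set is available.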

The technique follows closely the one used in (Golub et al., 1999). However, the technique can be further improved as follows. Firstly, this technique was shown to provide good results if most controls are consistent by providing the same type of reactivity. However, preliminary data showed that there are control subjects that show a non-specific reactivity with all clones, while still clearly different from patients. Such control subjects with a high non-specific reaction introduce spikes in the clone profile in the area corresponding to the control subjects. These spikes decrease the score of the relevant clones, making them more difficult to distinguish from the irrelevant ones. In order to reduce this effect, all control subjects with a non-specific response (i.e. a unimodal distribution) were eliminated from the analysis leading to the epitope selection.

A second essential modification is related to the set of epitopes selected. There are rare patients who might react only to a small number of very specific epitopes. If the selection of the epitopes is done on statistical grounds alone, such very specific epitopes can be missed if the set of patients available contains only a few such rare patients. In order to maximize the sensitivity of the final test resulting from this work, every effort was made to include epitopes which might be the only ones reacting to rare patients. In order to do this, the information content of the set of epitopes is maximized while trying to minimize the number of epitopes used, using the following procedure.

Procedure 2

It is assumed that there are m patients and k controls. n random patients are selected from the m available. For each of the n patients used for epitope selection, amplification is performed (n×4 biopannings) as well as self-reactions. Those patients/epitopes that do not react to themselves are eliminated.

A chip is made with all available, self-reacting epitopes printed in quadruplicates. This chip is reacted with all patients and controls (n+k antibody reactions). Controls with a non-specific reactivity are eliminated. For the set of epitopes coming from a single patient, Procedure 1 is applied to order the epitopes in the order of their informational content and the ones that can be used to differentiate patients from controls are selected.

The epitopes are ordered by their reactivity in decreasing order of the number of patients they react to. This list is scanned from the top down, and epitopes are moved from this list to the final set. Every time a set of epitopes from a patient x is added to the final set, the patient x and all other patients that these epitopes react to are represented in the current set of epitopes. This is repeated until all patients are represented in the current set of epitopes.

This procedure minimizes the number of epitopes used while maximizing the number of patients that react to the chip containing the selected epitopes.
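The ordering and top-down scan of Procedure 2 can be sketched as a single pass over the sorted clone sets. The reactivity table below is hypothetical and is not taken from the figures; the function and variable names are likewise illustrative.

```python
def select_clone_sets(reactivity):
    """Pick clone sets top-down until every patient is represented.

    `reactivity` maps a source patient to the set of patients (self
    included) whose sera react specifically with that patient's clones.
    """
    targets = set(reactivity)
    for reacting in reactivity.values():
        targets |= reacting

    # Most broadly reactive clone sets first; ties keep their input order.
    ordered = sorted(reactivity, key=lambda src: len(reactivity[src]), reverse=True)

    selected, covered = [], set()
    for source in ordered:
        if covered >= targets:
            break
        if reactivity[source] - covered:  # adds a patient not yet represented
            selected.append(source)
            covered |= reactivity[source]
    return selected, covered


# Hypothetical specific-reactivity table for ten patients.
reactivity = {
    1: {1},
    2: {2, 4, 8},
    3: {3, 6, 10},
    4: {4},
    5: {5, 9},
    6: {6},
    7: {7, 9},
    8: {8},
    9: {9},
    10: {10},
}
selected, covered = select_clone_sets(reactivity)
```

In this hypothetical table the scan keeps the clone sets from patients 2, 3, 5, 7 and 1, and skips the others because their patients are already represented. A single pass over a fixed ordering is a heuristic: it keeps the set small but is not guaranteed to find the smallest possible set.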

The following simple example shows how this procedure works. The matrix in FIG. 1 contains a row i for the clones coming from patient i and a column j for the serum coming from patient j. A serum is said to react specifically with a set of clones if the histogram of the ratios is bimodal. A serum is said to react non-specifically if the histogram of the ratios is unimodal. Furthermore, a serum might not react at all with a set of clones. If the serum from patient j reacts specifically with the clones from patient i, the matrix can contain a value of 1 at the position (i, j). The element at position (i, j) is left blank if there is no reaction or the reaction is non-specific.

Each set of epitopes corresponding to a row of the matrix is pruned by sub-selecting epitopes according to Procedure 1. The rows are now sorted in decreasing reactivity (number of patients other than self that the clones react to). For instance, in FIG. 2, the clones from patient 2 react with sera from self (column 2) and patients 4 and 8. The clones from patient 3 react with sera from self (column 3) and patients 6 and 10, etc. The final set of clones was obtained from patients 2, 3, 5, 7 and 1 (reading top-down in column 1). Clones coming from patients 8, 9 and 10 are not included since these patients already react to clones coming from other patients. This set ensures that the chip made with these clones reacts with all patients in this example.

Procedure 3

Arrays using two channels such as Cy5 for the human IgG and Cy3 for the T7 control are processed as follows. The spots are segmented and the mean intensity is calculated for each spot. A mean intensity value is calculated for the background, as well. A background corrected value is calculated by subtracting the background from the signal. The values coming from each channel are normalized by dividing by their mean. Subsequently, the ratio between the IgG and the T7 channels is calculated and a logarithmic function is applied. The values coming from replicate spots (spots printed in quadruplicates) are combined by calculating mean and standard deviation. Outliers (outside ± two standard deviations) are flagged for manual inspection. Someone skilled in the art can recognize that various combinations and permutations of the steps above or similar could replace the normalization procedure above without substantially changing the rest of the data analysis process. Such similar steps include without limitation taking the median instead of the mean, using logarithmic functions in various bases, etc.
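The log transformation and replicate combination can be sketched as below, with the variations mentioned above (median instead of mean, other logarithm bases) exposed as parameters. The function name and the replicate ratios are hypothetical. Outlier flagging is left out, because the text does not say which distribution the two standard deviations refer to.

```python
from math import log
from statistics import mean, median, stdev


def combine_replicates(ratios, base=2.0, use_median=False):
    """Combine replicate IgG/T7 ratios for one clone on a log scale.

    Returns (center, spread): the mean (or median) and the sample
    standard deviation of the log ratios. Ratios must be positive,
    otherwise math.log raises ValueError.
    """
    logs = [log(r, base) for r in ratios]
    center = median(logs) if use_median else mean(logs)
    spread = stdev(logs) if len(logs) > 1 else 0.0
    return center, spread


# Hypothetical quadruplicate ratios for a single clone.
center, spread = combine_replicates([4.0, 4.0, 2.0, 8.0])
median_center, _ = combine_replicates([4.0, 4.0, 2.0, 8.0], use_median=True)
```

Working on a log scale makes a two-fold increase and a two-fold decrease symmetric around zero, which is what the histogram-based test of the average log ratio relies on.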

The histogram of the average log ratio is calculated. If the histogram is unimodal there is no specific response. If the histogram is clearly bimodal, there is a specific response. All subjects analyzed fell in one of these two categories or had no response at all. A mixed probability model is used in less clear cases to fit two normal distributions as in (Lee, 2000). If the two distributions found under the maximum likelihood assumption are separated by a distance d of more than 2 standard deviations (corresponding to a p-value of approximately 0.05), there is a specific response. If the distance is less than 2 standard deviations, the response can be considered as not specific. The preliminary data analyzed so far showed a very good separation of the distributions for the patients.
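
For the less clear cases, a two-component normal mixture can be fitted by expectation-maximization. The following is a minimal sketch under stated assumptions and is not the method of (Lee, 2000): the initialization by a split at the median, the fixed number of iterations, the floor on the standard deviation, and the use of the larger of the two fitted standard deviations in the two-standard-deviation criterion are all choices made here for illustration.

```python
import math


def normal_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_normals(values, n_iter=200, min_sd=1e-3):
    """Fit a two-component normal mixture by EM; returns weights, means, SDs."""
    xs = sorted(values)
    half = len(xs) // 2
    parts = (xs[:half], xs[half:])
    mu = [sum(p) / len(p) for p in parts]
    sd = [max(min_sd, math.sqrt(sum((x - m) ** 2 for x in p) / len(p)))
          for p, m in zip(parts, mu)]
    w = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in xs:
            p = [w[k] * normal_pdf(x, mu[k], sd[k]) for k in range(2)]
            total = (p[0] + p[1]) or 1e-300
            resp.append((p[0] / total, p[1] / total))
        # M-step: update weight, mean, and SD of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-9:
                continue
            w[k] = nk / len(xs)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sd[k] = max(min_sd, math.sqrt(var))
    return w, mu, sd


def is_specific_response(values, n_sd=2.0):
    """True if the fitted means are more than n_sd standard deviations apart."""
    _, mu, sd = fit_two_normals(values)
    return abs(mu[1] - mu[0]) > n_sd * max(sd)


# Hypothetical log ratios forming two well-separated groups (not measured data).
demo = [-0.5, -0.25, 0.0, 0.25, 0.5] * 4 + [4.5, 4.75, 5.0, 5.25, 5.5] * 2
print(is_specific_response(demo))
```

In practice the starting values and the definition of the distance d would be fixed by the analyst, since EM can converge to different solutions from different initializations.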

Once the chosen clones are spotted on the final version of the array, a number of sera coming from both patients and controls can be tested. These sera come from subjects not used in any of the phases that led to the fabrication of the array (i.e., not involved in clone selection, not used as controls, etc.). Each test is evaluated using Procedure 3 above. The performance on this validation data can be reported in terms of PPV, NPV, specificity and sensitivity. Since these performance indicators are calculated on data not previously used, they provide a good indication of the performance of the test for screening purposes for the various categories of patients envisaged in the general population.
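
The four performance indicators named above follow directly from the counts of true and false positives and negatives. A minimal sketch, in which the function name and the example labels are hypothetical and 1 denotes a patient while 0 denotes a control:

```python
def screening_metrics(truth, predicted):
    """Sensitivity, specificity, PPV and NPV from binary truth and predictions."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical validation labels (not study data).
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
print(screening_metrics(truth, predicted))
```

Note that PPV and NPV depend on the proportion of patients in the tested sera, so values obtained on a validation set carry over to a screening population only if the prevalence is comparable.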

The present invention also provides a kit including all of the technology for performing the above analysis. This is included in a container of a size sufficient to hold all of the required pieces for analyzing sera, as well as a digital medium such as a floppy disk or CDROM containing the software necessary to interpret the results of the analysis. These components include the array of clones or peptides spotted onto a solid support, prewashing buffers, a detection reagent for identifying reactivity of the patients' serum antibodies to the spotted clones or peptides, post-reaction washing buffers, primary and secondary antibodies to quantify reactivity of the patients' serum antibodies with the spotted array and methods to analyze the reactivity so as to establish an interpretation of the serum reactivity.

A biochip, otherwise known as a biosensor, for detecting the presence of the disease state in a patient's sera is provided by the present invention. The biochip has a detector contained within the biochip for detecting antibodies in a patient's sera. This allows a patient's sera to be tested for the presence of a multitude of diseases or reaction to disease markers using a single sample, and the analysis can be conducted and analyzed on a single chip. By utilizing such a chip, the time required for the detection of disease is lowered while also enabling a doctor to determine the level of disease spread or infection. The chip, or other informatics system, can be altered to weight the results. In other words, the informatics can be altered to adjust the levels of sensitivity and/or specificity of the chip.

The above discussion provides a factual basis for the use of the combination of markers and method of making the combination. The methods used with a utility of the present invention can be shown by the following non-limiting examples and accompanying figures.

EXAMPLE 1

Combining phage display technology with protein microarray technology, 5,133 selectively cloned tumor antigens were screened and ranked using a feature selection method based on receiver-operator characteristic curves for neural network classifiers. A model was built using the top-ranked 40 biomarkers. The entire modeling strategy, both feature selection and model development, was validated by bootstrapping on an independent set of 80 cancer and 78 control sera. Estimated accuracy of this modeling strategy was 82.9% (95% CI 77.2-87.9) with sensitivity of 83.2% (95% CI 74.0-91.6) and specificity of 82.7% (95% CI 74.5-93.6). The accuracy of this novel diagnostic platform represents a significant improvement over current diagnostic accuracy of 37% to 56% in the primary care setting. Further, the diagnostic test described can be easily translated into assays such as enzyme linked immunosorbent assay that is already widely available in clinical practice and familiar to clinical laboratories. This facilitates wide-spread application of this assay as a simple tool to screen (in asymptomatic patients) and diagnose (in symptomatic patients) HNSCC in high risk population.

Methods

HNSCC and control patients were recruited from the otolaryngology clinic population.

Construction of T7 phage display cDNA libraries. HNSCC specimens were obtained at the time of surgical extirpation and poly-A RNA extracted and purified. The construction of T7 phage cDNA display libraries was performed using Novagen's OrientExpress cDNA Synthesis (Random primer system) and Cloning System as per manufacturer's suggestions.

Differential biopanning of HNSCC phage display cDNA libraries. Differential biopannings using sera from control and HNSCC patients were performed as per the manufacturer's protocol (T7Select System, TB178).

Protein microarray immunoreaction. Individual clones were picked and arrayed in replicates of 6 onto FAST™ slides (Schleicher & Schuell) using a robotic microarrayer Prosys 5510TL (Cartesian Technologies).

Microarray data analysis. Following immunoreaction, the microarrays were scanned with an Axon Laboratories 4100A scanner (Axon Laboratories) using 635 nm and 532 nm lasers to produce a red (AlexaFluor647) and green (AlexaFluor532) composite image. Using the ImaGene™ 6.0 (Biodiscovery) image analysis software, the binding of each of the cancer-specific peptides with IgGs in each serum was then analyzed and expressed as a ratio of red to green fluorescent intensities. The microarray data were further processed by a sequence of transformations including background correction, omission of poor quality spots, log-transformation, normalization by subtracting the global median (in log scale), then combining of 6 spot replicates to yield a mean value for each marker.

Sequencing of phage cDNA clones. Individual phage clones were PCR amplified using forward primer 5′GTTCTATCCGCAACGTTATGG3′ and reverse primer 5′GGAGGAAAGTCGTTTTTTGGGG3′ and sequenced using forward primer by Wayne State University Sequencing Core Facility.

Serum samples. Blood samples from HNSCC patients (Stages I-IV) and control patients were obtained after informed consent. Both HNSCC and control patients were recruited from the otolaryngology clinic population. All enrolled HNSCC patients have cancer confirmed on pathology. Control patients underwent thorough head and neck examination and imaging to rule out the presence of cancer after they initially presented with nonspecific head and neck symptoms such as sore throat, hoarseness, dysphagia, coughing, choking and gasping, neck mass, and otalgia. Blood samples were centrifuged at 2,500 rpm at 4° C. for 15 minutes and supernatants were stored at −70° C.

Construction of T7 phage display cDNA libraries. Head and neck cancer specimens were obtained at the time of surgical extirpation and immediately placed in RNAlater solution (Ambion). Total RNA extraction was performed using TRIZOL reagent (Invitrogen Corporation) per the manufacturer's protocol. After extraction, poly-A RNAs were purified twice using the Straight A mRNA Isolation System (EMD Biosciences) per protocols from the manufacturer. The construction of T7 phage cDNA display libraries was performed using Novagen's OrientExpress cDNA Synthesis (Random primer system) and Cloning System as per the manufacturer's suggestions (Novagen).

Differential biopanning of HNSCC phage display cDNA libraries. Differential biopannings using sera from normal healthy patients and HNSCC patients were performed as per manufacturer's protocol (Novagen: T7Select System, TB178). Protein G Plus-agarose beads were used for serum immunoglobulins (IgGs) immobilization. Three to five rounds of biopanning were performed using serum from each of the 12 HNSCC patients. Each cycle of biopanning consisted of passing the entire phage library through protein-G beads coated with IgGs from pooled sera of healthy controls, passage through beads coated with IgGs from individual serum from HNSCC patients, followed by final elution of bound phage clones from the beads.

Protein microarray immunoreaction. Individual clones were picked and arrayed in replicates of six onto FAST™ slides (Schleicher & Schuell) using a robotic microarrayer Prosys 5510TL (Cartesian Technologies) with 32 Micro-Spotting Pins (TeleChem). Protein microarrays were blocked with 4% milk in 1×PBS for one hour at room temperature followed by another hour of incubation with primary antibodies consisting of human serum at a dilution of 1:300 in PBS, mouse anti-T7 capsid antibodies (0.15 μg/ml) (EMD Biosciences, Madison, Wis.), and BL21 E. coli cell lysates (5 μg/ml). The microarrays were then washed three times in PBS/0.1% Tween-20 solution for four minutes each at room temperature and then incubated with AlexaFluor647 (red fluorescent dye)-labeled goat anti-human IgG antibodies (1 μg/ml) and AlexaFluor532 (green fluorescent dye)-labeled goat anti-mouse IgG antibodies (0.05 μg/ml) (Molecular Probes, Eugene, Oreg.) for one hour in the dark. Finally, the microarrays were washed three times in PBS/0.1% Tween-20 for four minutes each, then twice in PBS for two minutes each, and air dried.

Results

Differential biopanning results in enrichment of T7 phage HNSCC cDNA display libraries. T7 phage cDNA display libraries were constructed from three HNSCC specimens (floor of mouth, base of tongue, and larynx). The insertion of foreign HNSCC cDNAs into the T7 phage capsid genes results in the production of fusion capsid proteins. Foreign peptides displayed in this fashion have been shown to fold in their native conformations, thus exposing both linear and conformational antigens on the surface of the bacteriophage where they are accessible for selection and analysis.

A potential limitation of the T7 phage display system, however, is the absence of post-translational modifications such as glycosylation, sulfation, and phosphorylation which can influence the folding and binding of these peptides. Each of these three cDNA phage libraries was found to contain between 10⁶ and 10⁷ primary recombinants. Since the majority of the clones in the HNSCC cDNA libraries carried normal self-proteins, differential biopanning was performed in order to enrich the cDNA libraries with clones expressing the HNSCC-specific peptides (FIG. 3). The technique relied on specific antigen-antibody reactions to remove clones binding to immunoglobulins present in control sera from non-cancer patients while retaining clones with peptides of interest (HNSCC-specific antigens) binding to antibodies in HNSCC sera that serve as the bait. In order to increase the diversity of HNSCC-specific peptides, the cDNA libraries were biopanned against sera from 12 HNSCC patients with tumors representing different subsites of head and neck (Table 1a and 1b).

FIG. 3 depicts a schema showing the process of combining phage-display technology, protein microarrays, and bioinformatics tools to profile and select a panel of 40 unique clones from an initial 10⁷ clones in the three HNSCC cDNA phage display libraries. Three cDNA libraries were constructed from HNSCC specimens. Because the majority of the clones in the HNSCC cDNA libraries carried normal self-proteins, subtractive biopanning was performed in order to enrich the cDNA libraries with clones expressing the HNSCC-specific peptides. The technique relied on specific antigen-antibody reactions to remove clones binding to immunoglobulins (IgGs) from control sera while retaining clones with peptides of interest (HNSCC-specific antigens) using antibodies in HNSCC sera as bait. Protein G Plus-agarose beads were used for serum IgG immobilization. Three to five rounds of biopanning were performed using serum from each of the 12 HNSCC patients. Each cycle of biopanning consisted of passing the entire phage library through protein-G beads coated with IgGs from pooled sera of healthy controls, passage through beads coated with IgGs from individual serum from HNSCC patients, followed by final elution of bound phage clones from the column. Following biopanning, a total of 5,133 clones were randomly chosen from the 12 highly enriched pools of T7 phage cDNA libraries. These clones were arrayed and immunoreacted against serum samples from 39 HNSCC and 41 control patients. The binding of arrayed HNSCC-specific peptides with antibodies in sera was quantified with the AlexaFluor647 (red-fluorescent dye)-labeled anti-human antibody. Any small variation in the amount of phage particles spotted throughout the microarray was quantified by measuring the amount of mouse anti-T7 capsid antibodies bound using AlexaFluor532 (green fluorescent dye)-labeled goat anti-mouse IgG antibody. Following immunoreaction, the microarray data was analyzed and processed by a sequence of transformations.
To reduce the number of clones for further analysis, a one-tailed t-test was used to select 1,021 clones (from the original 5,133 clones) with increased reactivity to cancer sera compared to control sera (p<0.1). Sera from 80 cancer and 78 non-cancer controls, not previously used for biopanning or selection of clones, were immunoreacted against the previously selected 1,021 HNSCC-specific peptides. The reactivity of each of the 1,021 cancer-specific peptides with each of these 158 sera was then analyzed. Following data normalization and transformation, the top 100 clones based on the one-tailed t-test were selected. Subsequently these 100 clones were re-ranked based on their performance in a neural network model. The performance measure used was the area under the curve (AUC) taken as an average over 200 bootstrap trials. These clones were then sequenced and analyzed for homology to mRNA and genomic entries in the GenBank databases using BLASTn. A classifier, based on a three-layer feed-forward neural network, was then built using the top unique biomarkers. Several models were created using increasing numbers of features from the top ranked clones. An input set size of 40 clones was found to be a good compromise between the performance and complexity of the classifier.

High throughput protein microarray immunoscreening for selection of informative HNSCC-specific biomarkers. A total of 5,133 clones were randomly chosen from the 12 highly enriched pools of T7 phage clones from these cDNA libraries. These clones were arrayed and immunoreacted against serum samples from 39 HNSCC patients (Table 2a) and 41 controls (Table 2b). The binding of arrayed HNSCC-specific peptides with antibodies in sera was quantified with the AlexaFluor647 (red-fluorescent dye)-labeled anti-human antibody. The amount of phage particles spotted throughout the microarray was quantified by measuring the amount of phage at each spot using a mouse monoclonal antibody to the T7 capsid protein quantified using AlexaFluor532 (green fluorescent dye)-labeled goat anti-mouse IgG antibody. The ratio of AlexaFluor647 intensity over AlexaFluor532 intensity was then calculated in order to account for any small variation in the serum antibody binding to antigens due to different amounts of phage particles spotted on the microarray (FIG. 4).

FIG. 4 shows phage clones that were spotted in replication of six in an ordered array onto FAST™ nitrocellulose coated glass slides. The reactivity of each clone with each serum sample was analyzed and expressed as a ratio of red fluorescent intensity to green fluorescent intensity. The binding of arrayed HNSCC-specific peptides with antibodies in sera was detected with the AlexaFluor647 (red-fluorescent dye)-labeled anti-human antibody. The use of mouse anti-T7 capsid antibodies, detected with the use of AlexaFluor532 (green fluorescent dye)-labeled goat anti-mouse IgG antibody, was necessary in order to normalize for any small variation in the amount of phage particles spotted throughout the microarray chip.

Following immunoreaction, the microarray data was processed by a sequence of transformations and then analyzed. To reduce the number of clones for further analysis, one-tailed t-tests under the R environment (v2.3.0) were used to select clones with increased binding to immunoglobulins present in cancer sera compared to control sera using the criterion of p<0.10; 1021 clones met the criterion. In general, the visually positive clones (yellow or orange spots) that reacted with cancer sera but not control sera (FIG. 5) corresponded to those for which the t-test was statistically significant.

FIG. 5 shows the protein microarrays immunoreacted against a serum sample from HNSCC patient (left panel) and serum from control patient (right panel). The visually positive clones are represented by yellow or orange spots in replication of six. These visually positive clones corresponded to the statistically significant clones selected based on the t-test.

Ranking of top clones based on protein microarray immunoreaction. Sera from 80 HNSCC patients and 78 controls, not previously used for biopanning or selection of clones, were immunoreacted against the previously selected 1,021 HNSCC-specific peptides. Of the 80 HNSCC patients, 18 had early stage (I and II) and 62 had advanced stage (III and IV) disease. The distribution of early and late stage disease reflects the distribution of HNSCC in clinical practice. HNSCC from almost all subsites of head and neck were represented (Table 3a). Cases and controls are similar in age, race, and gender (Table 3b). The reactivity of each of the 1,021 cancer-specific peptides with each of these 158 sera was then analyzed. Following data normalization and transformation, clones were ranked on the basis of the attained significance levels from the one-tailed t-tests and the top 100 clones were selected. Subsequently these 100 clones were re-ranked based on their performance in a neural network model using area under the ROC curve (AUC) averaged over 200 bootstrap trials. These 100 clones were then sequenced and analyzed for homology to mRNA and genomic entries in the GenBank databases using BLASTn. The predicted amino acids were also determined in-frame with the T7 gene 10 capsid protein. A list of unique clones was generated by eliminating duplicate clones as well as those clones containing truncated peptides with fewer than 5 amino acids.
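
As a simplified, non-limiting illustration of ranking by bootstrap-averaged AUC, the sketch below scores each clone on its own using the empirical area under the ROC curve, averaged over bootstrap resamples. This is a swapped-in simplification: it does not use a neural network model as described above, and the function names, clone names, and reactivity values are hypothetical.

```python
import random
from statistics import mean


def auc(cancer_values, control_values):
    """Empirical AUC: fraction of (cancer, control) pairs ranked correctly."""
    wins = 0.0
    for c in cancer_values:
        for n in control_values:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5  # ties count as half
    return wins / (len(cancer_values) * len(control_values))


def bootstrap_auc(cancer_values, control_values, n_boot=200, seed=0):
    """Average AUC over bootstrap resamples drawn with replacement."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_boot):
        c = [rng.choice(cancer_values) for _ in cancer_values]
        n = [rng.choice(control_values) for _ in control_values]
        scores.append(auc(c, n))
    return mean(scores)


def rank_clones(cancer, control, n_boot=200, seed=0):
    """Return (clone, mean AUC) pairs sorted by decreasing mean AUC."""
    scored = [(clone, bootstrap_auc(cancer[clone], control[clone], n_boot, seed))
              for clone in cancer]
    return sorted(scored, key=lambda item: -item[1])


# Hypothetical log-ratio reactivities for two clones (not study data).
cancer = {"clone_A": [2.1, 2.4, 2.8, 3.0], "clone_B": [0.1, 0.9, 0.4, 0.6]}
control = {"clone_A": [0.2, 0.5, 0.1, 0.7], "clone_B": [0.2, 0.8, 0.5, 0.3]}
print(rank_clones(cancer, control))
```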

Validation of modeling strategy. A classifier, based on a three-layer feed-forward neural network, was then built using the top unique biomarkers, using the nnet package under the R environment (v2.3.0). Several models were created using increasing numbers of features from the top ranked clones. An input set size of 40 clones was found to be a good compromise between the performance and complexity of the classifier. Special attention was paid to avoid data overfitting by using a reduced number of hidden nodes (n=4) as well as using a training method (Broyden-Fletcher-Goldfarb-Shanno) that included regularization, as implemented in the nnet package. The performance of the classifier was estimated by averaging over 100 bootstrap samples from the original data set (80 cancer and 78 control sera) to obtain an accuracy of 82.9% (95% CI 77.2-87.9) with sensitivity of 83.2% (95% CI 74.0-91.6) and specificity of 82.7% (95% CI 74.5-93.6) (FIG. 6). The classifier was able to detect early stage HNSCC at least as well as late stage cancers. In fact, the accuracy on early stage cancers (86.7%) was slightly better than the accuracy on late stage cancers (82.9%). The performance of this classifier in detecting cancer from different subsites of head and neck region was 85% (glottis), 86.2% (supraglottis), 90% (hypopharynx), 83.5% (oropharynx), 95% (nasopharynx), and 78.6% (oral cavity).

Of the top 40 unique clones, there were five clones that contained known gene products in the reading frame of the T7 gene 10 capsid protein. These included ubiquinone-binding protein, NADH dehydrogenase subunit 1, pp 21 homolog (transcription elongation factor A (SII)-like 7), multiple myeloma overexpression gene 2, and C10 protein (Table 4). The remaining 35 clones contained peptides that were different from the original proteins coded by the inserted gene fragments. This occurred because the inserted gene fragments were out of frame with the open reading frame of the T7 10B gene (n=14) (Table 5a), represented untranslated region of known genes (n=11) (Table 5b), or contained sequences from unknown genes (n=10) (Table 5c). It is likely that the recombinant gene products of these clones mimic some other natural antigens, and hence can be termed mimotopes. BLASTp search of the SWISSPROT database for homology to each in-frame mimotope revealed that many of these gene products mimic known cancer proteins and as such represent putative tumor antigens.

FIG. 6 shows the schema used to calculate the real accuracy from apparent accuracy and optimism. A classifier (C), based on the top 40 clones, was trained and tested using 80 cancer and 78 control samples. The resulting accuracy (96.2%) is called the apparent accuracy and is an optimistic estimate of the true accuracy. Estimation of the real-world performance of the classifier methodology (feature selection and model training) was evaluated by averaging over 100 bootstrap samples. Each bootstrap run involved resampling by drawing from the original dataset with replacement. For every bootstrap sample, a set of 40 features was selected which performed best on this new arrangement of the data and a classifier (C′) was trained based on that data and these features. The classifier C′ was subsequently applied to both the original data and bootstrap datasets. The difference between the accuracies on these two data sets provided a measure of the “optimism” embedded in the apparent accuracy. The expected real-world performance (82.9%) was obtained by subtracting the mean values of the optimism (13.3%) from the apparent accuracy parameter (96.2%). The confidence intervals of the estimated performance indices were determined from the empirical distribution of the optimism value.
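
The optimism correction described for FIG. 6 can be sketched as follows. With the reported figures, the arithmetic is 96.2% − 13.3% = 82.9%. The code is a minimal illustration under stated assumptions: a nearest-centroid rule stands in for the neural network, features are ranked by the difference in class means rather than by the selection method described above, and the data generator and all function names are hypothetical.

```python
import random
from statistics import mean


def train(X, y, k):
    """Select k features by class-mean difference, then store class centroids."""
    def gap(j):
        pos = [row[j] for row, lab in zip(X, y) if lab == 1]
        neg = [row[j] for row, lab in zip(X, y) if lab == 0]
        return mean(pos) - mean(neg)
    feats = sorted(range(len(X[0])), key=gap, reverse=True)[:k]
    centroids = {lab: [mean(row[j] for row, l in zip(X, y) if l == lab)
                       for j in feats] for lab in (0, 1)}
    return feats, centroids


def accuracy(model, X, y):
    feats, centroids = model

    def predict(row):
        d = {lab: sum((row[j] - c) ** 2 for j, c in zip(feats, centroids[lab]))
             for lab in (0, 1)}
        return 1 if d[1] < d[0] else 0
    return mean(1.0 if predict(row) == lab else 0.0 for row, lab in zip(X, y))


def optimism_corrected_accuracy(X, y, k, n_boot=100, seed=0):
    """Return apparent accuracy, mean optimism, and their difference.

    Assumes y contains both classes; resamples missing a class are redrawn.
    """
    rng = random.Random(seed)
    apparent = accuracy(train(X, y, k), X, y)
    optimism = []
    while len(optimism) < n_boot:
        idx = [rng.randrange(len(y)) for _ in y]
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        if len(set(yb)) < 2:
            continue
        model = train(Xb, yb, k)  # feature selection repeated on each resample
        optimism.append(accuracy(model, Xb, yb) - accuracy(model, X, y))
    mean_optimism = mean(optimism)
    return apparent, mean_optimism, apparent - mean_optimism


def demo_data(n_per_class=15, n_features=8, seed=1):
    """Hypothetical data: only feature 0 carries signal, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for lab in (1, 0):
        for _ in range(n_per_class):
            X.append([rng.gauss(1.0 if (lab == 1 and j == 0) else 0.0, 1.0)
                      for j in range(n_features)])
            y.append(lab)
    return X, y


X_demo, y_demo = demo_data()
print(optimism_corrected_accuracy(X_demo, y_demo, k=3, n_boot=20))
```

Repeating the feature selection inside every bootstrap run, as the text describes, is what allows the correction to account for the optimism introduced by selecting features on the same data used for training.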

Evaluation of neural network classifier performance using 10-fold cross validation. Sera from 80 HNSCC patients and 78 controls, not previously used for biopanning or selection of clones, were immunoreacted against the previously selected 1,021 HNSCC-specific peptides. Of the 80 HNSCC patients, 18 had early stage (I and II) and 62 had advanced stage (III and IV) disease, reflecting the distribution of HNSCC in clinical practice. HNSCC from almost all subsites of head and neck were represented (Table 3a). In order to reflect the target screening population, control sera used were taken from patients who presented with symptoms or exams similar to that of head and neck cancer patients. Many of these control patients also have history of moderate to excessive tobacco and/or alcohol use. Cases and controls are similar in age, race, and gender (Table 3b).

A ten-fold cross-validation procedure was used to assess the performance of the neural network model for cancer classification based on patterns of serum immunoreactivity against a panel of biomarkers. In this scheme, the data was split into 10 equal parts, balanced with respect to the cancer and control groups. Both the clone (feature) selection and model training were based on 9/10 of the data and the model was tested on the remaining 1/10 of the dataset. Feature selection based on training data (142 samples) included several steps. First, clones that immunoreacted, on average, less with sera from cancer patients than controls were discarded. The remaining clones were then ranked using the p-value from a t-test and the top 250 clones were used individually to derive a ROC curve. Finally, the top 130 clones ranked in decreasing order of area under the ROC curve (AUC) were retained to build a classification model, based on a three-layer feed-forward neural network, using the nnet package³² under the R environment (v2.3.0). Special attention was paid to avoid data overfitting by using a reduced number of hidden nodes (n=5) as well as using a training method (Broyden-Fletcher-Goldfarb-Shanno) which included regularization, as implemented in the nnet package. The resulting classifier was then tested against the independent test set of 16 samples. The entire 10-fold cross-validation was repeated 100 times in order to minimize any potential bias due to random partition of training and test sets (FIG. 2).
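
The balanced ten-way split can be illustrated with the short sketch below. It is written in Python for illustration only (the analysis described above was carried out in R), and the function names are hypothetical. With 80 cancer and 78 control sera, it yields test sets of 15 or 16 samples and training sets of 142 or 143 samples.

```python
import random


def stratified_folds(labels, n_folds=10, seed=0):
    """Assign sample indices to folds so each class is spread evenly."""
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    position = 0
    for cls in sorted(set(labels), reverse=True):
        members = [i for i, lab in enumerate(labels) if lab == cls]
        rng.shuffle(members)
        for i in members:
            # Deal indices round-robin, continuing where the last class ended.
            folds[position % n_folds].append(i)
            position += 1
    return folds


def train_test_splits(labels, n_folds=10, seed=0):
    """Yield (train_indices, test_indices) for each fold."""
    folds = stratified_folds(labels, n_folds, seed)
    for k, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test


labels = [1] * 80 + [0] * 78  # 1 = cancer, 0 = control
for train_idx, test_idx in train_test_splits(labels):
    # Feature selection and model training would use train_idx only;
    # test_idx is held out until the final evaluation of that fold.
    print(len(train_idx), len(test_idx))
```

Repeating the call with different seed values gives different random partitions, which corresponds to repeating the whole cross-validation many times.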

The classification method yields an average accuracy of 74.6% (95% CI, 52.5% to 96.7%) with AUC of 82.3%, sensitivity of 73.1%, and specificity of 76.1%. Notably this classifier was able to detect early stage HNSCC (72.8%) at least as well as late stage cancers (73.2%). The sensitivity of this classifier in detecting cancer from different subsites of head and neck region was 73.9% (glottis), 72.6% (supraglottis), 83.7% (hypopharynx), 74.9% (oropharynx), 87.5% (nasopharynx), 67.3% (oral cavity), and 60% (unknown primary). To further verify that there is a true link between the immunoreaction level of the different clones and the class membership of the samples (cancer vs. healthy), the class identifiers were randomly permuted among the patients and the cross-validation procedure was repeated with the permuted data. As expected, the estimates of accuracy and AUC obtained in these permuted cases were around 50% and were statistically significantly different (p<1e−15) from the accuracy (74.6%) and AUC (82.3%) obtained using the actual class identifiers (FIG. 3).
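
The label-permutation check can be sketched as below. This is an illustrative simplification: the scoring function here is a fixed threshold on a single hypothetical reactivity value and is not retrained, whereas the check described above re-ran the entire cross-validation for every permutation. The empirical p-value formula used is also an assumption and is not the test that produced the reported significance level.

```python
import random
from statistics import mean


def permutation_check(labels, evaluate, n_perm=200, seed=0):
    """Compare the score on true labels with scores on shuffled labels."""
    rng = random.Random(seed)
    observed = evaluate(labels)
    null_scores = []
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        null_scores.append(evaluate(shuffled))
    # Empirical p-value: share of permutations scoring at least as well.
    exceed = sum(1 for s in null_scores if s >= observed)
    p_value = (exceed + 1) / (n_perm + 1)
    return observed, mean(null_scores), p_value


def threshold_accuracy(values, labels, threshold):
    """Accuracy of calling a sample cancer (1) when its value exceeds threshold."""
    hits = sum(1 for v, lab in zip(values, labels) if (v > threshold) == (lab == 1))
    return hits / len(labels)


# Hypothetical reactivities: ten cancer sera followed by ten control sera.
values = [2.0 + 0.1 * i for i in range(10)] + [0.1 * i for i in range(10)]
labels = [1] * 10 + [0] * 10
result = permutation_check(labels, lambda labs: threshold_accuracy(values, labs, 1.45))
print(result)
```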

Characterization of the 130-biomarker panel. Clones were ranked based on the number of times they were selected and used in the 130-biomarker panel out of 1,000 possible iterations (10-fold×100 different partitions) (Table 4). The top 130 markers were sequenced and analyzed for homology to mRNA and genomic entries in the GenBank databases using BLASTn. The predicted amino acids were also determined in-frame with the phage T7 gene 10 capsid protein. Of the top 130 clones, there were 8 clones that contained known gene products in the reading frame of the T7 gene 10 capsid protein. These included multiple myeloma overexpression gene 2, ubiquinone binding protein, NADH dehydrogenase subunit 1, C10 protein, and a hypothetical protein LOC400242 (Table 5). The remaining 122 clones contained peptides that were different from the original proteins coded by the inserted gene fragments. This occurred because the inserted gene fragments were out of frame with the open reading frame of the T7 10B gene (n=61) (Table 6a), represented untranslated regions of known genes (n=18) (Table 6b), or contained sequences from unknown genes (n=43) (Table 6c). It is likely that the recombinant gene products of these clones mimic some other natural antigens, and hence can be termed mimotopes. BLASTp search of the SWISSPROT database for homology to each in-frame mimotope revealed that many of these gene products mimic known cancer proteins and as such represent putative tumor antigens.

Discussion

Protein microarray technology was used for high throughput quantitative analysis of the antibody-antigen reaction between 238 serum samples and 5,133 cloned cancer antigens pre-selected via biopanning. Using specialized bioinformatics, the massive dataset was mined to identify a panel of top 130 cancer biomarkers (FIG. 1). Many of these markers represent or mimic known cancer antigens. Based on this panel of 130 cancer biomarkers, a trained neural network classifier has a sensitivity of 73.1% and specificity of 76.1% in discriminating cancer from noncancer serum samples in an independent test set. The performance of this classifier represents a significant improvement over current diagnostic accuracy in the primary care setting with reported sensitivity of 37% to 46% and specificity of 24%.

The results presented in this study indicate the potential of a new platform for cancer detection based on analysis of the pattern of serum immunoreactivity against a panel of cancer antigens. The pattern of immunoreactivity is highly reproducible. In addition, serum IgGs were found to be extremely stable, which should minimize interlaboratory variations in the clinical diagnostics setting. A significant advantage of the approach is the potential to translate this technology into simple assay systems, such as the enzyme-linked immunosorbent assay (ELISA), which is already widely available in clinical practice and familiar to clinical laboratories. This facilitates the wide-spread application of this assay as a simple tool in the primary care setting to screen (in asymptomatic patients) and diagnose (in symptomatic patients) HNSCC in the high-risk population.

Finally, the pattern of reactivity of biomarkers with serum samples from HNSCC can be analyzed to develop other classifiers capable of predicting clinical outcome and thereby guiding the most optimal therapeutic treatments. For example, the clinical outcomes that can be predicted include, but are not limited to, response to a particular therapeutic intervention or chemotherapeutic drug, survival, and development of neck or distant metastasis. These biomarkers also have utility in post-treatment monitoring of HNSCC patients and provide new targets for therapeutic interventions or diagnostic imaging in future clinical trials. Because the host immune system can reveal molecular events (overexpression or mutation) critical to the genesis of HNSCC, this proteomics technology can also identify genes with mechanistic involvement in the etiology of the disease.

In conclusion, a novel technology based on a combination of high throughput antigen selection, microarray-based serological profiling, and specialized bioinformatics identified a panel of antigen biomarkers that can provide sufficient accuracy for a clinically relevant, serum-based cancer detection test based on the pattern of serum immunoglobulin binding.

Throughout this application, various publications, including United States patents, are referenced by author and year and patents by number. Full citations for the publications are listed below. The disclosures of these publications and patents in their entireties are hereby incorporated by reference into this application in order to more fully describe the state of the art to which this invention pertains.

The invention has been described in an illustrative manner, and it is to be understood that the terminology that has been used is intended to be in the nature of words of description rather than of limitation.

Obviously, many modifications and variations of the present invention are possible in light of the above teachings. It is, therefore, to be understood that within the scope of the appended claims, the invention can be practiced otherwise than as specifically described.

TABLE 1a Tumor subsites and stages of the 12 HNSCC sera used in the subtractive biopanning.

Subsites | Stage I | Stage II | Stage III | Stage IV | Total
Supraglottis | 0 | 0 | 0 | 3 | 3
Glottis | 1 | 0 | 0 | 1 | 2
Nasopharynx | 0 | 0 | 0 | 0 | 0
Oropharynx | 0 | 0 | 0 | 3 | 3
Hypopharynx | 0 | 0 | 0 | 0 | 0
Oral Cavity | 0 | 0 | 0 | 3 | 3
Unknown | 0 | 0 | 0 | 1 | 1
Total | 1 | 0 | 0 | 11 | 12

TABLE 1b Patient characteristics of the 12 HNSCC patients and 9 control patients whose sera were used in the subtractive biopanning.

Characteristic | HNSCC | Control
Number of patients | 12 | 9
Mean Age (range) | 61.6 (46-87) | 38 (22-56)
Race: African American | 6 | 5
Race: Caucasian | 6 | 3
Race: Hispanics | 0 | 1
Gender: Male | 11 | 5
Gender: Female | 1 | 4

TABLE 2a Tumor subsites and stages of the 39 HNSCC sera used for immunoscreening of the original 5,134 clones.

Subsites | Stage I | Stage II | Stage III | Stage IV | Total
Supraglottis | 1 | 2 | 1 | 5 | 9
Glottis | 4 | 0 | 2 | 1 | 7
Nasopharynx | 1 | 0 | 0 | 0 | 1
Oropharynx | 1 | 1 | 0 | 8 | 10
Hypopharynx | 0 | 0 | 1 | 0 | 1
Oral Cavity | 1 | 2 | 0 | 5 | 8
Cervical Esophagus | 0 | 0 | 0 | 1 | 1
Unknown | 0 | 0 | 0 | 2 | 2
Total | 8 | 5 | 4 | 22 | 39

TABLE 2b Patient characteristics of the 39 HNSCC patients and 41 control patients whose sera were used for immunoscreening of the original 5,134 clones.

                    HNSCC       Control
Number of patients  39          41
Mean Age (range)    60 (26-87)  47 (20-76)
Race
  African American  22          27
  Caucasian         17          13
  Hispanics         0           1
Gender
  Male              35          27
  Female            4           14

TABLE 3a Tumor subsites and stages of the 80 HNSCC sera used for immunoscreening of the 1,021 clones arrayed on the master protein microchips.

Subsites            Stage I  Stage II  Stage III  Stage IV  Total
Supraglottis        2        1         5          2         10
Glottis             2        1         3          8         14
Nasopharynx         1        1         0          2         4
Oropharynx          1        1         0          14        16
Hypopharynx         0        1         3          5         9
Oral Cavity         2        5         1          13        21
Cervical Esophagus  0        0         0          1         1
Unknown             0        0         0          5         5
Total               8        10        12         50        80

TABLE 3b Patient characteristics of the 80 HNSCC patients and 78 control patients whose sera were used for immunoscreening of the 1,021 clones arrayed on the final protein microchips.

                    HNSCC       Control
Number of patients  80          78
Mean Age (range)    60 (27-81)  51 (19-74)
Race
  African American  45          45
  Caucasian         35          33
Gender
  Male              59          57
  Female            21          21

TABLE 4 Ranking of clones based on the number of times out of 1,000 possible rankings they were selected in the 130 biomarker panel used to build the neural network classifier.

Name    Rank  Number of times selected out of 1,000 possible times
1_H3    1     1000
1_H8    2     1000
10_C3   3     1000
10_F12  4     1000
10_G8   5     1000
10_H6   6     1000
2_D4    7     1000
2_D8    8     1000
2_G11   9     1000
2_G8    10    1000
2_H4    11    1000
5_C8    12    1000
6_C8    13    1000
6_D4    14    1000
6_G11   15    1000
6_G12   16    1000
7_C11   17    1000
7_C4    18    1000
7_G4    19    1000
10_B11  20    999
10_G11  21    999
11_C4   22    999
2_D3    23    999
3_D4    24    999
9_C4    25    999
1_D7    26    998
10_C9   27    998
7_C3    28    998
9_G4    29    998
9_G8    30    998
11_B12  31    997
2_C10   32    996
2_G4    33    996
9_G12   34    995
11_C8   35    994
5_G3    36    994
8_C4    37    994
10_C4   38    993
10_G5   39    993
10_G7   40    993
5_C11   41    991
1_D12   42    990
10_D6   43    985
10_D12  44    982
4_G8    45    982
1_D11   46    980
5_H8    47    979
8_G11   48    979
1_D2    49    978
6_H8    50    976
9_C12   51    976
11_B8   52    974
9_F12   53    972
10_B12  54    969
2_G7    55    965
1_G4    56    954
5_C4    57    954
4_G7    58    953
1_H2    59    952
9_G7    60    949
10_H8   61    947
11_C7   62    947
2_G3    63    946
2_D6    64    943
10_F10  65    942
10_G12  66    934
8_G3    67    933
1_H7    68    931
2_C8    69    919
12_D10  70    907
9_H4    71    904
10_H11  72    889
2_H3    73    877
2_C12   74    871
1_G3    75    868
11_B10  76    866
5_C3    77    864
2_C11   78    857
11_C9   79    847
5_H4    80    845
1_C12   81    843
10_D7   82    840
2_H7    83    830
10_E12  84    819
9_C3    85    810
11_C3   86    809
2_C2    87    809
5_C7    88    807
11_A1   89    796
10_F11  90    795
2_F3    91    787
2_H8    92    785
2_G2    93    782
10_F8   94    775
10_H3   95    774
11_C1   96    761
9_C2    97    759
2_C3    98    728
6_C12   99    728
10_B10  100   710
12_C4   101   697
2_C7    102   669
8_B12   103   659
5_D4    104   653
5_G4    105   652
8_D4    106   644
1_H4    107   634
1_D6    108   598
4_G4    109   587
10_B4   110   581
11_A2   111   571
10_E8   112   570
10_C1   113   558
8_H10   114   558
6_G8    115   551
9_C6    116   541
1_D4    117   538
6_C4    118   528
6_G3    119   498
12_C12  120   495
1_D8    121   489
5_C6    122   479
12_D2   123   453
6_F4    124   453
1_G8    125   442
8_G8    126   431
9_H7    127   424
9_D11   128   418
6_C3    129   417
10_D3   130   408
11_D4   131   406
2_F4    132   405
10_C8   133   403
4_B3    134   401
4_F8    135   400
9_B12   136   389
2_D2    137   374
11_B6   138   373
4_C8    139   360
2_C1    140   349
5_G6    141   348
11_C6   142   344
11_B2   143   326
3_C4    144   325
1_G7    145   320
11_D8   146   317
1_C8    147   312
9_G10   148   306
8_B4    149   302
8_C8    150   293
12_D6   151   292
10_F6   152   287
7_G1    153   287
7_G3    154   286
2_G10   155   284
8_B3    156   281
10_B8   157   276
3_C8    158   272
1_G2    159   270
1_D3    160   269
9_F10   161   267
8_D8    162   261
9_G11   163   251
5_C12   164   236
11_C11  165   233
4_C4    166   232
7_G2    167   223
7_H4    168   217
6_A10   169   213
9_H5    170   210
11_A11  171   207
11_A3   172   193
6_B4    173   183
10_G4   174   180
10_H9   175   176
2_H1    176   174
11_D5   177   172
2_B11   178   163
8_F2    179   158
11_B9   180   150
11_B1   181   149
1_G12   182   146
12_C7   183   145
9_B8    184   145
6_D11   185   143
4_C3    186   138
7_G8    187   130
9_C1    188   121
10_B3   189   120
2_G12   190   119
11_D10  191   114
10_H7   192   113
5_F8    193   112
11_C10  194   111
8_B7    195   111
2_C9    196   110
1_H11   197   109
11_B4   198   109
10_F7   199   104
8_F3    200   104
9_H6    201   100
4_G3    202   98
5_G8    203   98
7_C9    204   93
1_G6    205   92
11_C12  206   89
11_D3   207   88
2_C4    208   84
5_G2    209   83
1_C4    210   80
5_G10   211   79
1_C10   212   78
11_A4   213   75
3_G7    214   74
12_D11  215   73
7_G6    216   73
8_C11   217   73
11_B3   218   71
6_F8    219   70
9_C10   220   68
1_B10   221   67
1_C11   222   67
5_H9    223   67
9_B4    224   66
11_D7   225   65
2_G1    226   64
10_C11  227   60
12_D12  228   58
9_G2    229   56
4_D7    230   55
1_H6    231   54
8_H6    232   52
10_H10  233   49
5_D8    234   49
12_D4   235   48
12_C9   236   47
2_F7    237   47
9_F1    238   45
1_H10   239   43
8_G5    240   42
6_C7    241   41
2_G6    242   37
8_H11   243   36
12_D9   244   32
2_F11   245   32
12_C8   246   31
5_G11   247   31
11_C5   248   30
11_D9   249   30
2_D11   250   30
9_F8    251   29
10_B9   252   28
4_D10   253   28
6_B10   254   28
5_C10   255   27
10_F4   256   26
5_D12   257   26
9_G3    258   26
2_D1    259   25
5_H11   260   25
9_C11   261   25
2_B6    262   23
8_G4    263   23
9_B3    264   21
11_B11  265   19
6_B8    266   19
6_F2    267   19
7_D4    268   19
7_H1    269   19
2_B8    270   18
3_D2    271   18
8_D10   272   18
6_F7    273   17
8_G12   274   17
9_C9    275   17
11_A12  276   15
12_B4   277   15
2_C6    278   15
6_D7    279   15
10_H4   280   14
2_H2    281   14
6_F3    282   14
11_C2   283   13
2_F8    284   13
10_G6   285   12
2_B2    286   12
1_F4    287   11
10_H5   288   11
12_D8   289   11
4_H1    290   11
7_G11   291   11
8_C12   292   11
9_H8    293   11
11_A5   294   10
3_C3    295   10
5_F11   296   10
10_C12  297   9
3_H8    298   9
4_B4    299   9
8_H3    300   9
10_B7   301   8
10_C7   302   8
11_A6   303   8
11_D11  304   8
11_D2   305   8
2_F2    306   8
6_B6    307   8
6_B9    308   8
7_C8    309   8
10_A12  310   7
10_D8   311   7
11_D1   312   7
12_A9   313   7
12_C11  314   7
2_B3    315   7
4_H4    316   7
6_B11   317   7
6_B3    318   7
9_B11   319   7
12_A1   320   6
12_A11  321   6
3_G8    322   6
11_A10  323   5
2_G5    324   5
3_D3    325   5
8_G7    326   5
1_C7    327   4
1_D10   328   4
10_H2   329   4
12_C3   330   4
2_B10   331   4
2_D7    332   4
2_H6    333   4
5_F4    334   4
5_G12   335   4
6_D6    336   4
7_B12   337   4
8_D2    338   4
9_D6    339   4
1_A6    340   3
1_C6    341   3
12_B5   342   3
3_H2    343   3
6_G7    344   3
7_A1    345   3
7_D3    346   3
7_H3    347   3
9_C7    348   3
9_E2    349   3
9_F3    350   3
1_C3    351   2
1_F3    352   2
1_G11   353   2
3_D12   354   2
3_H5    355   2
5_B8    356   2
6_A9    357   2
7_E1    358   2
8_H2    359   2
9_D2    360   2
9_E12   361   2
9_E5    362   2
9_E9    363   2
1_D9    364   1
1_G9    365   1
10_D4   366   1
10_E11  367   1
10_E3   368   1
10_E4   369   1
10_E5   370   1
11_B7   371   1
12_B8   372   1
12_B9   373   1
12_D3   374   1
12_D7   375   1
2_F6    376   1
3_C2    377   1
3_G6    378   1
5_A9    379   1
5_B3    380   1
6_B12   381   1
7_F4    382   1
8_B8    383   1
8_C3    384   1
8_G6    385   1
9_B9    386   1
9_E1    387   1
9_E10   388   1
9_H10   389   1
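The selection counts in Table 4 can be read as the output of a repeated-resampling tally: in each of 1,000 trials the clones are scored, the best-scoring clones form a panel, and each clone's count is the number of trials in which it made the panel. The following Python sketch illustrates that idea only. The per-clone score (an absolute Welch t statistic), the 80% subsampling fraction, the data layout, and the function names are all assumptions made for illustration and are not taken from this application.

```python
import random
from collections import Counter
from statistics import mean, stdev


def t_statistic(a, b):
    """Welch t statistic for two lists of signal intensities."""
    spread = (stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b)) ** 0.5
    return 0.0 if spread == 0 else (mean(a) - mean(b)) / spread


def selection_counts(cases, controls, panel_size, n_trials=1000, frac=0.8, seed=0):
    """Count how often each clone lands in the top `panel_size` across trials.

    `cases` and `controls` are lists of samples; each sample is a dict
    mapping a clone name to its measured intensity (hypothetical layout).
    """
    rng = random.Random(seed)
    clones = sorted(cases[0])
    counts = Counter({clone: 0 for clone in clones})
    n_cases = max(2, int(frac * len(cases)))
    n_controls = max(2, int(frac * len(controls)))
    for _ in range(n_trials):
        # Draw a random subset of sera for this trial (assumed scheme).
        sub_cases = rng.sample(cases, n_cases)
        sub_controls = rng.sample(controls, n_controls)
        scores = {
            clone: abs(t_statistic([s[clone] for s in sub_cases],
                                   [s[clone] for s in sub_controls]))
            for clone in clones
        }
        # The best-scoring clones form this trial's panel.
        top = sorted(clones, key=scores.get, reverse=True)[:panel_size]
        counts.update(top)
    # Highest selection count first, as in the ranking table.
    return counts.most_common()
```

Calling `selection_counts(cases, controls, panel_size=130)` on real array data would return (clone, count) pairs in the same order as the Name and count columns above, provided the scoring and resampling assumptions were replaced by the ones actually used.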

TABLE 5 Eight of the 130 top clones represented epitopes. These clones contained gene fragments in the reading frame of the T7 gene 10 capsid gene and thus expressed the same original peptides coded by the inserted gene fragments. Peptide sequences of Size Description of the Mimotopes of sequences that are Description of the genes that are in-frame with pep- in-frame with T7 Rank Clone in-frame with T7 10B gene T7 10B gene tide 10B gene 4 10_F12 gi|19717685|gb|AF487338.1| AEMFPEG 51 gi|20503274|gb|AAL96264.2| H sapiens multiple myeloma AGPYVDL Multiple myeloma overexpression gene 2 DEAGGST overexpression (MYEOV2) mRNA GLLMDLA gene 2 Length = 452, ANEKAVH Score = 698 bits(352) ADFFNDF Expect = 0 EDLFDDD Id = 363/368 (98%) DIQ Gaps = 0/368(0%) 12 5_C8 gi|190805|gb|M37387.1|HUMQBPIC LVSRPFQ 20 gi|190806|gb|AAA60361.1| Human mitochondrial HQASGW Ubiquinone ubiquinone-binding protein MVFENGI binding protein (HQPI) gene, exon 2 TMLQDSI Length = 75 NWG Score = 131 bits(66) Expect = 2e−28 Id = 73/74(98%) Gaps = 1/74(1%) 19 7_G4 gi|78499271|gb|DQ246833.1| PPWSPIVE 27 gi|29690911|gb|AAQ88761.1| H sapiens isolate IND23 LLDAQVA NADH mitochondrion, complete ADPDKLV dehydrogenase genome, ERFELAV subunit 1 Length = 16320 DALSPEV Score = 218 bits (110) YTTYFVT Expect = 2e−54 KTLLLTS Id = 118/121(97%) LFL Gaps = 0/121 (0%) Product = NADH dehydrogenase subunit 1 40 10_G7 gi|19717685|gb|AF487338.1| AEMFPEG 51 gi|20503274|gb|AAL96264.2| H sapiens multiple myeloma AGPYVDL Multiple myeloma overexpression gene 2 DEAGGST overexpression (MYEOV2) mRNA GLLMDLA gene 2 Length = 452, ANEKAVH Score = 698 bits(352) ADFFNDF Expect = 0 EDLFDDD Id = 363/368 (98%) DIQ Gaps = 0/368(0%) 41 5_C11 gi|190805|gb|M37387.1|HUMQBPIC LVSRPFQ 20 gi|190806|gb|AAA60361.1| Human mitochondrial HQASGW Ubiquinone ubiquinone-binding protein MVFENGI binding protein (HQPI) gene, exon 2 TMLQDSI Length = 75 NWG Score = 131 bits(66) Expect = 2e−28 Id = 73/74(98%) Gaps = 1/74(1%) 66 10_G12 gi|1633547|gb|U47924.1|HSU47924 
GRKGLEF 48 gi|19923951|ref|NP612434.1 Human cromosome 12p13 ARLVKSY C10 protein sequence, complete sequence EAQDPEI Length = 222930 ASLSGKL Features in this part of subject KALFLPP sequence: C10 MTLPPHG Score = 430 bits (217) PAAGGSV Expect = 8e−118 AAS Id = 224/227 (98%) Gaps = 0/227 (0%) 81 1_C12 gi|5360992|emb|AL023494.12|HS366L4 GWSAMA 28 gi|46409510|ref|NP997326.1| Human DNA sequence from QSWLTAT hypothetical clone RP3-366L4 on STSRVQVI protein chromosome 22q11.23-12.2, LLPQPPE LOC400242 Length = 37658 Score = 309 Bits (156) Expect = 6e−82 Id = 156/156 (100%) Gaps = 0/156 (0%) Strand = Plus/Minus 111 11_A2 gi|19717685|gb|AF487338.1| AEMFPEG 51 gi|20503274|gb|AAL96264.2| H sapiens multiple myeloma AGPYVDL Multiple myeloma overexpression gene 2 DEAGGST overexpression (MYEOV2) mRNA GLLMDLA gene 2 Length = 452 ANEKAVH Score = 698 bits(352) ADFFNDF Expect = 0 EDLFDDD Id = 363/368 (98%) DIQ Gaps = 0/368(0%) Rank Region of similarity of peptide 4 Score = 168 bits (389), Expect = 3e−42 Id = 51/51 (100%), Pos = 51/51 (100%) Gaps = 0/51 (0%) Query2 EMFPEGAGPYVDLDEAGGSTGLLMDLAANEKAVHADFFNDFEDLFDDDDIQ        EMFPEGAGPYVDLDEAGGSTGLLMDLAANEKAVHADFFNDFEDLFDDDDIQ Sbjct7 EMFPEGAGPYVDLDEAGGSTGLLMDLAANEKAVHADFFNDFEDLFDDDDIQ 12 Score = 72.7 bits (164) Expect = 6e−14 Id = 20/20 (100%) Pos = 20/20 (100%) Gaps = 0/20 (0%) Query 10 ASGWMVFENGITMLQDSINW 29          ASGWMVFENGITMLQDSINW Sbjct  4 ASGWMVFENGITMLQDSINW 23 19 Score = 49.3 bits (116) Expect = 5e−05 Id = 23/27 (85%) Pos = 25/27 (92%) Gaps = 0/27 (0%) Query  27 LAVDALSPEVYTTYFVTKTLLLTSLFL  53           +  DALSPE+YTTYFVTKTLLLTSLFL Sbjct 245 MTYDALSPELYTTYFVTKTLLLTSLFL 271 40 Score = 168 bits (389) Expect = 3e−42 Id = 51/51 (100%) Pos = 51/51 (100%) Gaps = 0/51 (0%) Query2 EMFPEGAGPYVDLDEAGGSTGLLMDLAANEKAVHADFFNDFEDLFDDDDIQ        EMFPEGAGPYVDLDEAGGSTGLLMDLAANEKAVHADFFNDFEDLFDDDDIQ Sbjct7 EMFPEGAGPYVDLDEAGGSTGLLMDLAANEKAVHADFFNDFEDLFDDDDIQ 41 Score = 72.7 bits (164) Expect = 6e−14 Id = 20/20 (100%) Pos = 20/20 (100%) Gaps = 
0/20 (0%) Query 10 ASGWMVFENGITMLQDSINW 29          ASGWMVFENGITMLQDSINW Sbjct  4 ASGWMVFENGITMLQDSINW 23 66 Score = 144 bits (333) Expect = 7e−35, Id = 47/48 (97%) Positives = 48/48 (100%) Gaps = 0/48 (0%) Q 11 LEFARLVKSYEAQDPEIASLSGKLGALFLPPMTLPPHGPAAGGSVAAS  58      L+FARLVKSYEAQDPEIASLSGKLGALFLPPMTLPPHGPAAGGSVAAS S 79 LKFARLVKSYEAQDPEIASLSGKLGALFLPPMTLPPHGPAAGGSVAAS 126 81 Score = 82.5 bits (187) Expect = 7e−17 Id = 25/28 (89%) Positives = 26/28 (92%) Gaps = 0/28 (0%) Query   1 GWSAMAQSWLTATSTSRVQVILLPQPPE  28           GWASAMAQSWLT TS S+VQVILLPQPPE Sbjct 121 GWSAMAQSWLTATSTSRVQVILLPQPPE 148 111 Score = 168 bits (389) Expect = 3e−42 Id = 51/51 (100%) Pos = 51/51 (100%) Gaps = 0/51 (0%) Query2 EMFPEGAGPYVDLDEAGGSTGLLMDLAANEKAVHADFFNDFEDLFDDDDIQ        EMFPEGAGPYVDLDEAGGSTGLLMDLAANEKAVHADFFNDFEDLFDDDDIQ Sbjct7 EMFPEGAGPYVDLDEAGGSTGLLMDLAANEKAVHADFFNDFEDLFDDDDIQ

Table 6. One hundred twenty-one of the top 130 clones contained peptides that differed from the original proteins encoded by the inserted gene fragments. This occurred because a) the inserted gene fragment was out of frame with the open reading frame of the T7 10B gene (n=61), b) the inserted gene fragment represented an untranslated region of a known gene (n=18), or c) the inserted gene fragment represented sequences from an unknown gene (n=43).
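To illustrate why an out-of-frame insert yields a peptide unrelated to the native protein, the short Python sketch below translates one DNA string in each of the three forward reading frames using the standard genetic code. The insert sequence and the function name are hypothetical and chosen only for illustration; they are not sequences or methods from this application.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, frame=0):
    """Translate `dna` from offset `frame` (0, 1, or 2) up to the first stop codon."""
    peptide = []
    for pos in range(frame, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3].upper()]
        if residue == "*":  # stop codon ends the displayed peptide
            break
        peptide.append(residue)
    return "".join(peptide)


insert = "ATGGCCAAATAA"  # hypothetical insert, not a sequence from the tables
for frame in range(3):
    print(frame, translate(insert, frame))
```

Shifting the start by one or two nucleotides changes every codon downstream, so the same insert produces three unrelated peptides. In a display system, only the frame that happens to continue the capsid gene's reading frame is expressed on the phage surface.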

TABLE 6a Peptide se- quences of Mimotopes, Size Description of the genes in-frame of that are in Mimotope with T7 pep- Description of the sequences Region of similarity of peptide Rank Clone clones 10 B gene tide that Mimotopes mimic (only the top 3 sequences are shown)  1 1_H3 gi|13097209|gb|BC00337 KLLHQGAR 24 gi|2135396|pir|S71548 Score = 28.6 bits (60), Expect = 1.1 0.1| RRRGLRTP Homeotic protein pG2 Id = 12/22 (54%), Positives = 14/22 (63%) Homo sapiens cystatin B ASVPISPS gi|85726204|gb|ABC79623.1| Gaps = 3/22 (13%) (stefin B) (cDNA clone cytokeratin associated Query 4 MGC: 17497 protein LHQGARRRRGLRTPASVPISPS 25 IMAGE: 3453675), gi|4505037|ref|NP_003564.1| L QG RRR  L   +SVP +PS complete cds Latent transforming growth Sbjct 228 Length = 613 factor beta binding protein LQQGGRPRGDL---SSVPTAPS 246 Score = 517 bits (261) 4 Score = 26.1 bits (54), Expect = 9.3 Expect = 1e−144 gi|56206067|emb|CAI23573.1 Identities = 9/11 (81%), Positives = 9/11 (81%), Id = 265/267 (99%) Myc target 1 Gaps = 1/11 (9%) Gaps = 0/267 (0%) gi|1703642|gb|AAB37683.1| Query 7 Strand = Plus/Plus p65 Net1 proto-oncogene ARRRRGLRTPA 17 protein ARRRA LRT A Sbjct 292 ARRRR-LRTAA 301 Score = 24.4 bits (50), Expect = 21 Id = 9/15 (60%), Positives = 9/15 (60%), Gaps = 5/15 (33%) Query 9 RRRRGLRTPASVPIS 23 RRRR     AS PIS Sbjct 95 RRRR-----ASAPIS 104  2 1_H8 gi|20357564|ref|NM_0001 VIQEPG 34 gi|28892|emb|CAA35582.1| Score = 30.3 bits (64), Expect = 0.34 00.2| GRGDKL Unnamed protein product Id = 12/22 (54%), Positives = 14/22 (63%), Homo sapiens cystatin B LHQGAR gi|2135396|pir||S71548 Gaps = 3/22 (13%) (stefin B) (CSTB), RRRCLR Homeotic protein pG2 Query 9 mRNA cystatin B (liver TPASVPI gi|48428276|sp|O43432|1F4G LHQGARRRXGLRTPASVPISPS 30 thiol proteinase SPS 3_HUMAN L QG RRR  L   +SVP +PS inhibitor) Eukaryotic translation Sbjct 227 Length = 674 initiation factor 4 LQQGGRRRGDL---SSVPTAPS 245 Score = 424 bits (214) gamma3(eIF-4-gamma3) Score = 26.9 bits (56), Expect = 4.6 Expect = 6e−116 
(eIF-4-gamma II) Id = 15/28 (53%), Positives = 17/28 (60%), Id = 214/214 (100%) gi|2829451|sp|P56182|NNP1 Gaps = 7/28 (25%) Gaps = 0/214 (0%) HUMAN NNP-1 protein Query 1 Strand = Plus/Plus (Novel nuclear protein 1) GRVIQEPGGRGDKLLHQGARR-----RR 23 (Nucleolar protein Nop52) GR  Q PGGRG  LL+ G+RR     RR gi|2190402|emb|CAA73944.1| Sb 880 latent TGF-beta binding GR-QTPGGRGVPLLNVGSRRSQPGQRR 905 protein-4 Score = 25.2 bits (52), Expect = 28 gi|56206067|emb|CAI23573.1 Id = 11/20 (55%), Positives = 13/20 (65%), myc target 1 Gaps = 6/20 (30%) gi|34098712|sp|Q9H0U9|TSY Query 15 L1_HUMANTestis-specific GRGDKLLHQGARRRRGLRTP 34 Y-encoded-like protein 1 GRG     +GAR+RR  RTP gi|1703642|gb|AAB37683.1| Sbjct 424 p65 Net1 proto-oncogene GRG----QRGARQRR--RTP 437 protein  6 10_H gi|75905883|gb|AY96358 NSSVD 5 gi|123291744|emb|CA114853. Score = 18.0 bits (35), Expect = 1284 6 5.2| 2| ADAMTS-like Id = 5/5 (100%), Positives 5/5 (100%), Homo sapiens isolate gi|119617528|gb|EAW97122. Gaps = 0/5 (0%) 14_LOf(Tor65) 1| SLIT-ROBO Rho Query 1 mitochondrion GTPase activating protein NSSVD 5 Length = 16569 1, isoform CRA_a NSSVD Score = 232 bits (117) gi|119591326|gb|EAW70920. Sbjct 541 Expect = 2e−58 1|SP140 nuclear body NSSVD 545 Id = 119/120 (99%) protein, isoform CRA_a Gaps = 0/120 (0%) gi|119620377|gb|EAW99971. Strand = Plus/Minus 1| EH domain binding protein 1, isoform CRA_c 10 2_G8 gi|39992414|gb|BC06441 GSGKIKKSV 20 gi|56202484|emb|CAI21939.1 Score = 24.0 bits (49), Expect = 29 8.1| Homo sapiens LWDRKVGI HBxAg transactivated Id = 9/13 (69%), Positives = 9/13 (69%), FK506 binding protein RKN protein 2 (XTP2) Gaps = 0/13 (0%) 9,63 kDa, mRNA gi|119611307|gb|EAW90901. 
Query 2 (cDNA 1|BAT2 domain containing SGKIKKSVIMDRK 14 cloneIMAGE: 5750487), 1, isoform CRA_d SG IKK VL D K partial cds gi|29427863|sp|Q8WUM0|N Sbjct 83 Length = 2421 U133_HUMAN SGPIKKPVLRDMK 95 Score = 486 bits (245) Nuclear pore complex Score = 24.0 bits (49), Expect = 41 Expect = 1e−134 protein Nup133 Id = 6/6 (100%), Positives = 6/6 (100%), Id = 259/264 (98%) (Nuclcoporin Nup133) Gaps = 0/6 (0%) Gaps = 0/264 (0%) (133 kDa nucleoporin) Query 8 Strand = Plus/Minus gi|61216828|sp|Q96HF1|SFR SVLWDR 13 P2_HUMAN Secreted SVLWDR apoptosis - related protein 1 Sbjct 101 gi|12803735|gb|AAH02704.1 SVLWDR 106 Signal transducer and Score = 23.1 bits (47), Expect = 73 activator of transcription 1 Id = 6/6 (100%), Positives = 6/6 (100%), gi|29337296|ref|NP_803881.1 Gaps = 0/6 (0%) Melanoma antigen family D, Query 6 4 isoform 2 KKSVLW 11 gi|57012811|sp|Q7Z419|IBR KKSVLW D2_HUMAN Sbjct 233 E3 ubiquitin ligase IBRDC2 KKSVLW 238 (p53- inducible RING finger protein) 11 2_H4 gi|39992414|gb|BC06441 GSRKIXWT 56 Gi|4758932|ref|NP_004563.1| Score = 48.1 bits (106), Expect = 5e−06 8.1| ALWETKVG Plakophilin 2 isoform 2b Id = 15/17 (88%), Positives = 16/17 (94%), Homo sapiens FK506 LCLKLKMD gi|15929032|gb|AAH14974.1| Gaps = 0/17 (0%) binding protein 9, 63 EPCLSHAC VIF1B protein Query 30 kDa, mRNA(cDNA YPNTLGGQ gi|15341934|gb|AAH13155.1| HACYPNTLGGQGGRITR 46 cloneIMAGE: 5750487), GGRITRXR CRYZL1 protein HAC P+TLGGQGGRITR partial cds LRPSWPTQ gi|66932986|ref|NP_0010193 Sbjct 477 Length = 2421 86.1| filamin-binding LIM HACNPSTLGGQGGRITR 493 Score = 307 bits (155) protein-1 isoform b Score = 43.9 bits (96), Expect = 9e−05 Expect = 3e−81 gi|386941|gb|AAA59814.1| Id = 14/17 (82%), Positives = 16/17 (94%), Id = 158/159 (99%) MHC HLA-DR-beta-1 Gaps = 0/17 (0%) Gaps = 0/159 (0%) chain Query 30 Strand = Plus/Minus gi|11038659|ref|NP_067610.1 HACYPNTLGGQGGRITR 46 ADAM metallopeptidase HAC P+TLGG+GGRITR with tlironibospondin type 1 Sbjct 270 gi|62088182|dbj|BAD92538.1 HACNPSTLGGRGGRITR 286 SLC2A11 
protein variant Score = 41.4 bits (90), Expect = 5e−04 gi|33356179|ref|NP_031370.2 Id = 13/16 (81%), Positives = 14/16 (87%), transcription termination Gaps = 0/16 (0%) factor, RNA polymerase 1 Query 30 gi|7416053|dbj|BAA93676.1| HACYPNTLGGQGGRIT 45 survivin-beta H C P+TLGGQGGRIT gi|3335138|gb|AAC39892.1| Sbjct 98 RNA polyinerase 1 40 kD HVCNPSTLGGQGGRIT 113 gi|522145|gb|AAB02649.1| B-cell growth factor 15 6_G1 gi|33876180|gb|BC00141 LAQLQHGK 15 gi|38372397|sp|Q9ULK0|GR1 Score = 23.5 bits (48), Expect = 54 1 0.2| Homo sapiens S100 NLQPYRD D1_HUMAN Glutamate Id = 8/12 (66%), Positives = 8/12 (66%), calcium binding protein receptor delta-1 subunit Gaps = 3/12 (25%) A11 (calgizzarin), precursor (GluR delta-1) Query 4 mRNA (cDNA clone gi|2499758|sp|Q92729|PTPR LQHGKNLQPYRD 15 MGC: 2149 U_HUMAN LQHG    PYRD IMAGE: 3140092), Receptor type tyrosine Sbjct 774 complete cds protein phosphatase U LQHGS---PYRD 782 Length = 568 precursor Score = 22.3 bits (45), Expect = 95 Score 844 bits (426) (Pancreatic carcinoma Id = 6/8 (75%), Positives = 7/8 (87%), Expect = 0.0 phosphatase 2) (PCP-2) Gaps = 0/8 (0%) Id = 430/432 (99%) gi|2441904|jgb|AAL65133.2| Query 8 Gaps = 0/432 (0%) Ovarian cancer related KNLQPYRD 15 Strand = Plus/Plus tumor marker CA125 KNL PYR+ /gene = “100A11” gi|30060232|gb|AAP13073.1| Sbjct 451 E3 ligase for inhibin KNLLPYRN 458 receptor Score = 22.3 bits (45), Expect = 95 gi|5053128|gb|AAD33053.2| Id = 9/14 (64%), Positives = 10/14 (71%), Scar2 Gaps = 1/14 (7%) gi|59800456|sp|Q9Y6W5|WA Query 1 SF2_HUMAN LAQLQHG-KNLQPY 13 Wiskott-Aldrich syndrome L+QL HG K L PY protein family member 2 Sbjct 13105 (WASP-family protein LSQLTHGIKELGPY 13118 member2) (WAVE-2 Score 22.3 bits (45), Expect = 95 protein) Id = 9/14 (64%), Positives = 10/14 (71%), Gaps = 1/14 (7%) Query 1 LAQLQHG-KNLQPY 13 L+QL HG K L PY Sbct 14041 LSQLTHGIKELGPY 14054 Score = 19.7 bits (39), Expect = 556 Id = 10/19 (52%), Positives = 11/19 (57%), Gaps = 4/19 (21%) Query 1 LAQLQHG-KNLQPY---RD 15 L+QL HG   L 
PY   RD Sbjct 20748 LSQLTHGITELGPYTLDRD 20766 Score = 18.0 bits (35), Expect = 1802 Id = 8/14 (57%), Positives = 9/14 (64%), Gaps = 1/14 (7%) Query 1 LAQLQHG-KNLQPY 13 L+QL HG   L PY Sbjct 15291 LSQLTHGITELGPY 15304 21 10_G gi|17390908|gb|BC01838 NSSL 4 gi|125987841|sp|Q99102|MU Score = 14.6 bits (27), Expect = 10793 11 6.1| Homo sapiens C4_HUMAN Id = 4/4 (100%), Positives = 4/4 (100%), cytochrome c oxidase Pancreatic adenocarcinoma Gaps = 0/4 (0%) subunit VIIb, mRNA mucin Query 1 (cDNA clone MGC: 9065 gi|125863935|sp|Q8N584|CR NSSL 4 IMAGE: 3861730) 017_HUMAN NSSL Length = 454 TPR repeat-containing Sbjct 44 Score = 549 bits (277), protein C18orf17 NSSL 47 Expect = e−153 gi|121944808|emb|CAK5073 Id = 316/328 (96%), 5.1| immunoglobulin A Gaps = 3/328 (0%) heavy chain variable region Strand = Plus/Plus gi|120431468|gb|ABM21709. tissue_type = “Ovary, 1| env protein adenocarcinoma” gi|119943116|ref|NP_001073 /gene = “COX7B” 328.1| G protein-coupled /product = “cytochrome c receptor 64 isoform 2 oxidase subunit VIIb, precursor” 22 11_C gi|4503528|ref|NM_00141 VDSRTRSM 19 gi|47458813|ref|NP_006242.4 Score = 28.6 bits (60), Expect = 1.1 4 6.1| TYSKSSTAT Protein kinase, AMP- Id = 14/27 (51%), Pos = 16/27 (59%), Eukaryotic translation PR activated, a-1 catalytic Gaps = 9/27 (33%) initiation factor 4A, subunit isoform 1 Query 1 isoform I (EIF4A1) (PRKAA1) VDSRT-----RSM----TYSKSSTATP 18 mRNA gi|47605556|sp|Q9NYF8|BCL VDSRT     RS+    T +KS TATP Length = 1383 F1_HUMAN Bcl-2 Sbjct471 Score = 371 bits(187) associated transcriptional VDSRTYLLDSRSIDDEITEAKSGTATP 497 Expect = 5e−100 factor (Btl) Score = 25.7 bits (53), Expect = 9.0 Id = 199/205(97%) gi|8039798|sp|P30414|NKCR Id = 8/11 (72%), Pos = 10/11 (90%), Gaps = 0/205(0%) _HUMAN NK-tumor Gaps = 0/11 (0%) Strand = Plus/Plus recognition protein Query 3 gi|5870841|gb|AAA35734.2| SRTRSMTYSKS 13 cyclophilin-related SR+RS TYS+S protein Sbjct 36 gi|24419041|gb|AAL65133.2| SRSRSRTYSRS 46 Ovarian cancer related Score = 23.5 bits (48), 
Expect = 52 tumor marker CA125 Id = 9/18 (50%), Pos = 12/18 (66%), gi|45768720|gb|AAH67812.1| Gaps = 6/18 (33%) Cyclin L1 Query 10 RTRSMTY-----------SK SST 21 RTRS++Y    S+  SST Sbjct 1328 RTRSVSYSHSRSRSRSST 1345 26 1_D7 gi|57165047|gb|AY86482 LPRVQAQG 64 gi|148429165|sp|Q86V8|THO Score = 36.7 bits (79), Expect = 0.016 4.1| GRVPEETE C4_HUMAN Id = 19/37 (51%), Positives = 21/37 (56%), Homo sapiens CDC37 GAGGGRGR THO complex subunit 4 Gaps = 14/37 (37%) cell division cycle 37 QGRAGAPA (Tho4) (Transcriptional Query 8 homolog (S. cerevisiae) GRGTAAAQ coactivator Aly/REF) GGRVPEETEGAGGGRGRQGRAGAPAGRGTAAAQGGAE 44 (CDC37) gene, complete GGAELGAE gi|47117879|sp|P83369|LSM1 GGR        GGGRCR GRAG+  GRG     GGA+ cds AGGDAQEG 1_HUMAN Sbjct 21 Length = 16460 SLRPHSSN U7 snRNA-associated Sm- GGR--------GGGRGR-GRAGSQGGRG-----GGAQ 43 Score = 333 bits (168) like protein LSm11 Score = 35.0 bits (75), Expect = 0.069 Expect = 4e−89 gi|62087836|dbj|BAD92365.1 Identities = 17/24 (70%), Positives = 17/24 (70%), Id = 168/168 (100%) serum response factor Gaps = 2/24 (8%) Gaps = 0/168 (0%) (c-fos serum response Query 19 Strand = Plus/Plus element-binding GGGRGRQGRA-GAPAGRGTAAAQG 41 transcriptional factor) GGGRGR GRA GA AG G AA  G variant Sbjct 69 gi|46397045|sp|Q9NQ03|SCR GGGRGR-GRARGAAAGSGVPAAPG 91 T2_HUMAN Score = 28.6 bits (60), Expect = 4.3 Transcriptional repressor Identities = 24/58 (41%), Positives = 24/58 (41%), scratch 2 Gaps = 17/58 (29%) Q6 AQGGRVPEETEGAGGGRGRQGRAGAPAGRGTAAAQGGA---------------ELGAE 48 A GGRVP    GAG G GR  R  A A   T A   GA               ELGAE S49 ANGGRVPG--NGAGLGPGRLEREAAAAAATTPAPTAGALYSGSEGDSESGEEEELGAE 104 27 10_C gi|31107|emb|Z11692.1|H QTTPGDCP 14 gi|135959|sp|P19438|TNR1A Score = 24.0 bits (49), Expect = 41 9 SEF2MR H.sapiens DTATLP _HUMAN Id = 6/7 (85%), Positives = 7/7 (100%), mRNA for elongation Tumor necrosis factor Gaps = 0/7 (0%) factor 2 receptor superfamily Query 3 Length = 3080 member 1A precursor TPGDCPD 9 Score = 848 bits (428) (p60) (TNF-R1) 
(TNF-R1) TPGDCP+ Expect = 0.0 (TNFR-1) (p55) (CD120a Sbjct 300 Id = 430/431 (99%) antigen) TPGDCPN 306 Gaps = 0/431 (0%) Tumor necrosis factor- Score = 23.1 bits (47), Expect = 74 Strand = Plus/Plus binding protein 1 (TBP1) Id = 6/6 (100%), Positives = 6/6 (100%), gi|119587880|gb|EAW67476. Gaps = 0/6 (0%) 1| Cas-Br-M (murine) Query 3 ecotropic retroviral TPGDCP 8 transforming sequence TPGDCP gi|119574887|gb|EAW54502. Sbjct 514 1| ubiquitin specific TPGDCP 519 peptidase 54 Score = 23.1 bits (47), Expect = 74 gi|115855|sp|P22681|CBL_(—) Id = 11/17 (64%), Positives = 11/17 (64%), HUMAN Gaps = 5/17 (29%) E3 ubiquitin-protein ligase Query 2 (Proto-oncogene c-CBL) TTPGDC-PD---TATLP 14 (Casitas B-lineage TTPG C P    TATLP lymphoma proto-oncogene) Sbjct 1375 TTPG-CNPQLTYTATLP 1390 29 9_G4 gi|71482588|ref|NM_0010 NSSV 4 gi|126215737|sp|Q1L5Z9|LO Score = 14.6 bits (27), Expect = 10793 20.4| NF2 HUMAN Id = 4/4 (100%), Positives = 4/4 (100%), Homo sapiens ribosomal Neuroblastoma apoptosis- Gaps = 0/4 (0%) protein S16 (RPS16), related protease Query 1 mRNA gi|120587027|ref|NP_065769. NSSV 4 Length = 603 3| ubiquitin specific NSSV Score = 260 bits (131) peptidase 31 Sbjct 206 Expect = 4e−67 gi|119626680|gb|EAX06275.1 NSSV 209 Id = 131/131 (100%) alpha-kinase 1 Gaps = 0/131 (0%) gi|119620124|gb|EAW99718. 
Strand = Plus/Plus 1| dual specificity phosphatase 11 31 11_B gi|74027260|ref|NM_0068 ERGKKRKI 35 gi|57863277|ref|NP_0010099 Score = 29.9 bits (63), Expect = 0.55 12 46.2| REMLQDM 21.1| hypothetical protein Id = 10/18 (55%), Positives = 13/18 (72%), Homo sapiens serine VPVVVEEE LOC23355 isoform a Gaps = 3/18 (16%) peptidase inhibitor, TLRTNVLSI gi|14042300|dbj|BAB55190.1| Query 7 Kazal type 5 (SPINK5), GNK unnamed protein product KIREM---LQDMVPVVVE 21 mRNA gi|50083293|ref|NP_116205.3 KI+ M    QDMVPV+V+ Length = 3610 hypothetical protein Sbjct 569 Score = 216 bits (109), LOC84902 KIQVMEQHFQDMVPVIVD 586 Expect = 1e−53 gi|2815622|gb|AAC39562.1| Score = 29.1 bits (61), Expect = 0.98 Id = 109/109 (100%), kinesin-related protein Id = 12/25 (48%), Positives = 14/25 (56%), Gaps = 0/109 (0%) gi|2407245|gb|AAB70531.1| Gaps = 10/25 (40%) Strand = Plus/Plus putative transcription factor Query 6 CR53 RKIREMLQDMVPVVVEEETL-RTNV 29 gi|38605529|sp|Q9Y4A5|TRR R+IRE+LQD         TL RT V AP_HUMAN Sbjct730 Transformation/transcription RRIRELLQD---------TLTRTGV 745 domain-associated protein Score 28.6 bits (60), Expect = 1.3 (350/400 kDa PCAF- Id = 12/24 (50%), Positives = 17/24 (70%), associated factor) Gaps = 4/24 (16%) gi|14589891|ref|NP_001784.2 Query 5 cadherin 3, type 1 KRKIREMLQDMVPVVVEEET--LR 26 preproprotein KR+ REM Q+M  ++ +EET  LR gi|226518|prf||1516312A Sbjct 1 Ca dependent cell adhesion KRREREMQQEM--MLRDEETMELR 22 protein gi|4099609|gb|AAD00657.1| cell division control-related protein 2b 32 2_C1 gi|14506788|ref|NM_00297 RKGKKSKR 14 gi|57209583|emb|CA141374.1 Score = 25.7 bits (53), Expect = 9.1 0 0.1| RKWLNS Sarcoma antigen 1 Identities = 10/20 (50%), Positives = 12/20 (60%), Homo sapiens gi|8216987|emb|CAB92443.1| Gaps = 7/20 (35%) spermidine/spermine Putative tumor antigen Query 1 N1-acetyltransferase gi|23396625|sp|Q9NQT8|K11 RKGKK---SKRRK----WLN 13 (SAT), mRNA 3B_HUMAN RK K+   SKRRK    WL+ Length = 1060 Kinesin-like protein Sbjct 51 Score = 1031 bits(520) KIF13B 
(Kinesin-like RKSKRHSSSKRRKSMSSWLD 70 Expect = 0.0 protein GAKIN) Score = 24.8 bits (51), Expect = 22 Id = 520/520(100%) gi|7227890|sp|O24175|FL_O Id = 6/6 (100%), Positives = 6/6 (100%), Gaps = 0/520(0%) RYSA Gaps = 0/6 (0%) Strand = Plus/Plus Putative transcription Query 8 factor FL (RFL) RRKWLN 13 gi|51701343|sp|Q8TD10|CHD RRKWLN 5_HUMAN Sbjct 1092 Chromodomain helicase- RRKWLN 1097 DNA-binding protein Score = 24.4 bits (50), Expect = 30 5(CHD-5) Id = 7/10 (70%), Positives = 10/10 (100%), gi|126178|sp|P14151|LYAM1 Gaps = 0/10 (0%) _HUMAN Query 1 L-selectin precursor RKGKKSKRRK 10 (Lymph node homing RKGKK++R+K receptor) (Leukocyte Sbjct 172 surface antigen Leu- RKGKKARRKK 181 8) (TQ1) (gp90-MEL) (Leukocyte- endothelial cell adhesion molecule 1) (LECAM1) (CD62L antigen) 35 11_C gi|45359854|ref|NM_0020 TKAK 4 gi|54607086|ref|NP_068756.2 Score = 14.6 bits (27), Expect = 7683 8 78.3| elongation factor for Id = 4/4 (100%), Positives = 4/4 (100%), Homo sapiens golgi selenoprotein translation Gaps = 0/4 (0%) autoantigen, golgin gi|57863304|ref|NP_056340.2 Query 1 subfamily a, 4 inhibitor of Bruton's TKAK 4 (GOLGA4), mRNA tyrosine hinase TKAK Length = 7717 gi|52548190|gb|AAU82085.1| Sbjct 318 Score = 270 bits (136) tumor amplified and TRAK 321 Expect = 1e−69 overexpressed sequence 2 Id = 138/139 (99%) gi|20800447|gb|AAF23433.3| Gaps = 0/139 (0%) breast cancer-associated Strand = Plus/Minus antigen BRCAA1 37 8_C4 gi|92443613|gb|BC01555 PNSG 4 gi|124001556|ref|NP_001028 Score = 15.1 bits (28), Expect = 8044 9.1| Homo sapiens 685.2|G protein-coupled Id = 4/4 (100%), Positives = 4/4 (100%), ribosomal protein S24, receptor 34 isoform 2 Gaps = 0/4 (0%) mRNA (cDNA clone gi|122892256|gb|ABM67195. 
Query 1 IMAGE: 3635225), 1| immunoglobulin heavy PNSG 4 Length = 3566 chain variable region PNSG Score = 680 bits (343) gi|122939151|ref|NP_001073 Sbjct 225 Expect = 0.0 626.1| Rho GTPase PNSG 228 Id = 343/343 (100%) activating protein 9 isoform Gaps = 0/343 (0%) 2 Strand = Plus/Plus 38 10_C >gi|19913404|ref|NM_00 NSSA 4 gi|124001558|ref|NP_689799. Score = 14.2 bits (26), Expect = 14482 4 3286.2| 3| ubiquitin specific Id = 4/4 (100%), Positives = 4/4 (100%), Homo sapiens protease 54 Gaps = 0/4 (0%) topoisomerase (DNA) I gi|122892053|gb|ABM67094. Query 1 (TOP1) 1| mutant lipase H NSSA 4 Score = 1003 bits (506) gi|120660404|gb|AAI30523.1| NSSA Expect = 0.0 Receptor tyrosine kinase- Sbjct 2051 Id = 561/585 (95%) like orphan receptor 2 NSSA 2054 Gaps = 1/585 (0%) gi|120660344|gb|AA130362.1| Strand = Plus/Plus BRCC2 gi|114432132|gb|AB174674.1| breast and ovarian cancer susceptibility protein 2 truncated variant 39 10_G gi|62897624|dbj|AK22303 RRLKKFCIT 9 gi|60549585|gb|AAX24102.1| Score = 25.2 bits (52), Expect = 11 5 2.1| cation-transporting P5- Id = 8/11 (72%), Positives = 8/11 (72%), Homo sapiens mRNA ATPase Gaps = 3/11 (27%) for beta actin variant, gi|33469249|gb|AAQ19673.1| Query 1 clone: JTH03396 general transcription factor RRLKK---FCI 8 Length = 1820 II i repeat domain 2 RRLKK   FCI Score = 309 bits (156) gi|57209558|emb|CAI41998.1 Sbjct 459 Expect = 4e−82 melanoma associated RRLKKRGIFCI 469 Id = 156/156 (100%) antigen (mutated) 1-like 1 Score = 23.5 bits (48), Expect = 36 Gaps = 0/156 (0%) gi|37932158|gb|AAP72185.1| Id = 6/6 (100%), Positives = 6/6 (100%), Strand = Plus/Minus migration-inducing protein 3 Gaps = 0/6 (0%) gi|13477103|gb|AAH05005.1| Query 3 Lung cancer-related protein 8 LKKFCI 8 gi|6919941|sp|Q15113|PCOL LKKFCI C_HUMAN Sbjct 603 Procollagen C-proteinase LKKFCI 608 enhancer protein precursor Score = 22.7 bits (46), Expect = 65 gi|61966767|ref|NP_0010136 Id = 6/6 (100%), Positives = 6/6 (100%), 82.1| stromal cell derived Gaps = 0/6 (0%) factor 
receptor 2 homolog Query 1 gi|74759724|sp|Q8IYT8|ULK RRLKKF 6 2_HUMAN RRLKKF Serine/threonine-protein Sbjct 441 kinase ULK2 RRLKKF 446 42 1_D1 gi|13097335|gb|BC00341 IFAGGGGP 26 gi|18204601|gb|AAH21228.1| Score = 31.6 bits (67), Expect = 2.9 2 8.1| AGPGRAGT FLJI0700 protein Id = 18/32 (56%), Positives = 18/32 (56%), Homo sapiens cyclic GGGRVASA gi|12644292|sp|P49840|GSK3 Gaps = 10/32 (31%) AMP phosphoprotein, TR A_HUMAN Glycogen Query 2 19 kD, mRNA (cDNA synthase kinase-3 alpha FAGG---GGP----AGPG---RAGTGGGRVAS 23 clone MGC: 5468 (GSK-3 alpha) F GG   G P    A PG   RAGTGGG VAS IMAGE: 3451558), gi|6291532|dbj|BAA86298.1| Sbjct475 complete cds Cbl-c FHGGQPTGAPLDCAAAPGAHYRAGTGGGPVAS 506 Length = 1228 gi|46397888|sp|Q9ULV8|CB Score = 29.1 bits (61), Expect = 1.1 Score = 335 bits (169) LC_HUMAN Signal Id = 15/24 (62%), Positives = 17/24 (70%), Expect = 4e−89 transduction protein CBL-C Gaps = 7/24 (29%) Id = 169/169(100%) gi|20149596|ref|NP_036248.2 Query 10 Gaps = 0/169(0%) Cas-Br-M (murine) GGGGP----AGPGRAGTGGGRVAS 29 Strand = Plus/Plus ecotropic retroviral GGGGP    +GPG  GTGGG+ AS transforming sequence c Sbjct 32 gi|56206355|emb|CAI18503.1 GGGGPGGSASGPG--GTGGGK-AS 52 HLA-B associated Score = 27.4 bits (57), Expect = 2.7 transcript 3 Id = 11/15 (73%), Positives = 12/15 (80%) gi|5453736|ref|NP_005351.2| Gaps = 2/15 (13%) v-maf musculoaponeurotic Query 3 fibrosarcoma oncogene AGGGGPAGPGRAGTG 17 homolog isoform a AGGGGP GPG  G+G gi|21759253|sp|O75444|MAF Sbjct 65 _HUMAN AGGGGPGGPG--GSG 77 Transcription factor Maf (Proto-oncogene c-mat) gi|47078300|ref|NP_542159.2 apoptosis related protein 3 isoform b gi|66932999|ref|NP_006276.2 testis-specific protein kinase 1 gi|2499660|sp|Q15569|TESK 1_HUMAN Dual-specificity testis- specific protein kinase 1 44 10_D gi|3329373|gb|AF038952. 
VKIMTLKS 17 gi|56417662|emb|CAI19057.1 Score = 23.1 bits (47), Expect = 53 12 1|AF038952 RQRSYKNP Novel protein Id = 7/8 (87%), Positives = 7/8 (87%), Homo sapiens cofactor G gi|16923934|gb|AAL31642.1| Gaps = 0/8 (0%) A protein mRNA, NOV1 Query 5 complete cds gi|40675561|gb|AAH64981.1| TLKSRQRS 12 Length = 574 Putative NEkB activating TLK RQRS Score = 392 bits (198) protein Sbjct 152 Expect = 2e−106 gi|51701611|sp|Q9BX67|JAM TLKERQRS 159 Id = 202/204 (99%) 3_HUMAN Junctional Score = 23.1 bits (47), Expect = 53 Gaps = 0/204(0%) adhesion molecule c Id = 9/15 (60%), Positives = 12/15 (80%), Strand = Plus/Plus precursor (JAM-C) Gaps = 1/15 (6%) gi|38570359|gb|AAR24620.1| Query 1 Migration-inducing gene 13 VKIMTLKSRQRSYKN 15 Deoxyhypusine synthase VK++TLK R+ S KN Sbjct 39 VKLLTLKPRETS-KN 52 Score = 23.1 bits (47), Expect = 53 Id = 7/9 (77%), Positives = 8/9 (88%), Gaps = 0/9 (0%) Query 6 LKSRQRSYK 14 L+SRQR YK Sbjct 260 LQSRQRDYK 268 45 4_G8 gi|56788032|gb|AY69246 RPARSRRM 14 gi|57163116|emb|CAI39857.1 Score = 26.5 bits (55), Expect = 5.1 4.1| Growth-ihibiting MAWGKA Protein (peptidyl-prolyl Id = 7/8 (87%), Positives = 7/8 (87%), gene 46 mRNA cis/trans isomerase) NIMA- Gaps = 0/8 (0%) Length = 1715 interacting, 4 (parvulin) Query 1 Score = 823 bits (415) gi|32171215|ref|NP_859066.1 RPARSRRM 8 Expect = 0.0 Transducer of regulated RP RSRPM Id = 415/415 (100%) cAMP response element Sbjct 14 Gaps = 0/415 (0%) binding protein RPERSRRM 21 Strand = Plus/Minus gi|37574639|gb|AAQ93056.1| Score = 23.5 bits (48), Expect = 40 Antigen MLAA-10 Id = 6/7 (85%), Positives = 6/7 (85%), gi|59800455|sp|Q9UPY6|WA Gaps = 0/7 (0%) SF3_HUMAN Query 6 Wiskott-Aldrich syndrome RRMMAWG 12 protein family member 3 RR MAWG (WASP-family protein Sbjct 144 member 3) RRTMAWG 150 gi|17368482|sp|Q15773|MLF Score = 23.1 bits (47), Expect = 53 2_HUMAN Id = 8/12 (66%), Positives = 9/12 (75%), Myeloid leukemia factor 2 Gaps = 2/12 (16%) gi|10720185|sp|Q9Y4L1|OXR Query 2 P_HUMAN PARSR-RMM-AW 11 150 kDa 
oxygen-regulated PA+R  RMM AW protein precursor (Orp150) Sbjct 46 (Hypoxia up-regulated 1) PAQPRHRMMSAW 57 47 5_H8 gi|33342277|ref|NM_0252 DWRVPR 36 gi|3647275|emb|CAA12110.1 Score = 30.8 bits (65), Expect = 0.31 04.2| PAPHHR Matrilin-3 Id = 11/17 (64%), Positives = 11/17 (64%), Hypothetical protein LGARRL gi|3252872|gb|AAC24200.1| Gaps = 5/17 (29%) PP2447, mRNA PNLHAA BRCA1-associated protein 2 Query 5 Score = 232 bits (117) PGRAAP gi|60390212|sp|Q969F8|KISS PRPAPHHRLGARRLPNL 21 Expect = 3e−58, RAASGL R_HUMAN PRPAP     ARRLP L Id = 117/117 (100%) KiSS-1 receptor (KiSS-1R) Sbjct 2 Strand = Plus/Plus (Metastin receptor) PRPAP-----ARRLPGL 13 gi|21265064|ref|NP_620688.1 Score = 30.3 bits (64), Expect = 0.41 ADAM metallopeptidase Id = 18/40 (45%), Positives = 18/40 (45%), with thrombospondin type 1 Gaps = 16/40 (40%) motif, 17 preproprotein 2 gi|51094804|gb|EAL24050.1| WRVPRPAPHHRLGARRLPNLHAA-----PGRAAPRAASGL 36 cAMP responsive element W  PRP        R LP L  A     PGRAA R AS L binding protein 3-like 2 566 Basic transcriptional factor W--PRP--------RALP-LRGAVGSCPPGRAAARGASDL 594 2 Score = 29.9 bits (63), Expect = 0.59 Id = 19/41 (46%), Positives = 19/41 (46%), Gaps = 17/41 (41%) 2 WRVPRPAPHHRLGARRLPNLHAAPG-R---------AAPRA 32 WRV RP   H      LP L AAPG R         AAPRA 41 WRV-RPDDVH------LPPLPAAPGPRRRRRPRTPPAAPRA 74 48 8_G1 Homo sapiens NSSCSE 6 gi|119629922|gb|EAX09517.1 Score = 19.3 bits (38), Expect = 638 1 prothymosin, alpha PBX|knotted 1 homeobox 1, Id = 5/6 (83%), Positives = 6/6 (100%), (gene sequence 28), isoform CRA_a Gaps = 0/6 (0%) mRNA (cDNA gi|91992160|ref|NP_055196. 
Query 1 clone MGC: 45641 mutL homolog 3 isoform 2- NSSCSE 6 IMAGE: 4335961), gi|6689928|gb|AAF23904.1|A +SSCSE complete cds F195657_1 DNA mismatch Sbjct 323 Length = 1213 repair protein DSSCSE 328 Score = 97.6 bits (49), gi|119623371|gb|EAX02966.1 Score = 19.3 bits (38), Expect = 638 Expect = 6e−18 scavenger receptor class F, Id = 5/6 (83%), Positives = 6/6 (100%), Identities = 49/49 (100%) member 2, isoform CRA b Gaps = 0/6 (0%) Gaps = 0/49 (0%) gi|6648106|sp|Q05086|UBE3 Query 1 Strand = Plus/Minus A_HUMAN Ubiquitin- NSSCSE 6 protein ligase E3A NS+CSE (Renal carcinoma antigen Sbjct 83 NY-REN-54) NSTCSE 88 gi|11385658|gb|AAG34910.1| Score = 19.3 bits (38), Expect = 638 AF273050_1 Id = 5/6 (83%), Positives = 6/6 (100%), CTCL tumor antigen se37-2 Gaps = 0/6 (0%) gi|1872514|gb|AAB49301.1| Query 1 E6-associated protein E6- NSSCSE 6 AP|ubiguitin-protein ligase N+SCSE Sbjct 102 NNSCSE 107 49 1_D2 gi|20357564|ref|NM_0001 RGDKLLHQ 27 gi|28892|emb|CAA35582.1| Score = 28.6 bits (60), Expect = 1.6 00.2| Homo sapiens GARRRRGL unnamed protein product Id = 12/22 (54%), Positives = 14/22 (63%), cystatin B (stefin B) RTPASVPIS gi|2135396|pir||S71548 Gaps = 3/22 (13%) (CSTB), mRNA PS homeotic protein pG2- Query 6 Length = 674 human LHQGARRRRGLRTPASVPISPS 27 Score = 232 bits (117) gi|38648796|gb|AAH63316.1| L QG RRR  L   +SVP+ PS Expect = 9e−59 FBXL17 protein Sbjct 228 Id = 117/117 (100%) LQQGGRRRGDL---SSVPTAPS 246 Gaps = 0/117 (0%) Score = 28.2 bits (59), Expect = 2.1 Strand = Plus/Plus Id = 11/18 (61%), Positives = 11/18 (61%), Gaps = 5/18 (27%) Query 11 RRRRGL-----RTPASVP 23 RRRR L     RTPA VP Sbjct 25 RRRRPLLRLPRRTPAKVP 42 52 11_B gi|110624286|dbj|AK2258 RCSSXNRX 35 gi|3184264|gb|AAC18917.1| Score = 30.8 bits (65), Expect = 0.42 8 50.1| Homo sapiens GREGCPRS F02569_2 ID = 11/16 (68%), Positives = 13/16 (81%), mRNA for E1B-55 kDa- CGLRNESQ gi|113426694|ret|XP_001130 Gaps = 3/16 (18%) associated protein 5 LHVARCWG 029.1| PREDICTED: Query 8 isoform a variant, clone: LPG 
hypothetical protein QG--REGCPRSCGLRNE 22 FCC125H07 gi|34533723|dbj|BAC86786.1| QG  REGCP  CGLR++ Length = 3447 unnamed protein product Sbjct 179 Score = 307 bits (155) gi|119568019|gb|EAW47634. QGAREGCP---CGLRHQ 192 Expect = 2e−81 1| fibronectin type III Score = 28.2 bits (59), Expect = 2.5 Id = 158/159 (99%) domain containing 1 Identities = 14/22 (63%), Positives = 15/22 (68%) Gaps = 0/159 (0%) Gaps = 5/22 (22%) Strand = Plus/Plus Query 7 RQG-REGCPRSCGLRNESQLHV 27 R G REGCPR C  R +S LHV Sbjct 33 RPGLPREGCPR-C--R-QSVLHV 50 Score = 27.8 bits (58), Expect = 3.3 Id = 9/12 (75%), Positives = 10/12 (83%), Gaps = 1/12 (8%) Query 7 RQGREG-CPRSC 17 RQGREG C R+C Sbjct 1234 RQGREGACHRAC 1245 55 2_G7 gi|48734768|gb|BC07192 ERPSKRYL 24 gi|51476730|emb|CAH18336. Score = 25.2 bits (52), Expect = 12 8.1| Homo sapiens HQAAGGGE 1| hypothetical protein Id = 12/23 (52%), Positives = 13/23 (56%), ribosomal protein S17, RKERQLCS gi|30354570|gb|AAH51765.1| Gaps = 9/23 (39%) mRNA (cDNA clone Nuclear factor kappa-B, Query 7 MGC: 88613 subunit 1 YLHQAAGGGER------KE--RQ 21 IMAGE: 5090053), gi|23271363|gcb|AAH35512.1| YL QA GGG+R      KE  RQ complete cds WD repeat domain 49 Sbjct 177 Length = 488 gi|31621301|ref|NP_853550.1 YL-QAEGGGDRQLGDREKELTRQ 198 Score = 476 bits(240) regulator of G-protein Score = 25.2 bits (52), Expect = 12 Expect = 7e−132 signalling like 1 Id = 12/23 (52%), Positives = 13/23 (56%), Id = 268/270 (99%) gi|40254949|ref|NP_065960.2 Gaps = 9/23 (39%) Gaps = 1/270(0%) erythrocyte membrane Query 7 Strand = Plus/Plus protein band 4.1 like 5 YLHQAAGGGER------KE--RQ 21 gi|14917040|sp|Q15127|SCA YL QA GGG+R      KE  RQ M2_HUMAN Sbjct 177 Secretory carrier-associated YL-QAEGGGDRQLGDREKELTRQ 198 membrane protein 2 Score = 24.0 bits (49), Expect = 29 gi|57162523|emb|CA140407.1 Id = 8/16 (50%), Positives = 9/16 (56%), transcription factor 7-like 2 Gaps = 5/16 (31%) (T-cell specific, HMG-box) Query 8 LHQAAGGGERKERQLC 23 LH      ERK +QLC Sbjct 675 LHH-----ERKAKQLC 685 58 4_G7 
gi|71559137|ref|NM_0010 SSQGHF 6 gi|21753429|dbj|BACO4343.1| Score = 21.8 bits (44), Expect = 78 29.3| unnamed protein product Id = 6/6 (100%), Positives = 6/6 (100%), Homo sapiens ribosomal Roquin (RING finger and Gaps = 0/6 (0%) protein S26 (RPS26), C3H zinc finger protein 1) Query 1 mRNA gi|55960182|emb|CAI17337.1 SSQGHF 6 Length = 699 G protein-coupled receptor SSQGHF Score = 367 bits (185) 123 Sbjct 74 Expect = 1e−98 gi|22749391|ref|NP_689910.1 SSQGHF 79 Id = 202/210 (96%) solute carrier family 44, Score = 19.3 bits (38), Expect = 619 Gaps = 0/210 (0%) member 5 Id = 5/5 (100%), Positives = 5/5 (100%), Strand = Plus/Plus gi|55960275|emb|CAI14751.1 Gaps = 0/5 (0%) novel MAM domain Query 2 containing protein SQGHF gi|73918937|sp|Q8NCS7|CTL Sbjct 74 5_HUMAN Choline SQGHF 78 transporter-like protein Score = 18.0 bits (35), Expect = 1097 Ribosomal protein S6 Id = 5/5 (100%), Positives = 5/5 (100%), kinase 2 (S6K2) Gaps = 0/5 (0%) (Serine/threonine-protein Query 1 kinase 14 beta) SSQGH 5 gi|55664933|emb|CAH71148. SSQGH 1| colony stimulating factor Sbjct 189 1 (macrophage) SSQGH 193 59 1_H2 gi|39992414|gb|BC06441 GSGKIKKSV 20 gi|19923432|ref|NP_054876.2 Score = 28.2 bits (59), Expect = 2.1 8.1| Homo sapiens LWDRKVGI coiled-coil domain Id = 8/12 (66%), Positives = 11/12 (91%), FK506 binding protein RKN containing 113 Gaps = 1/12 (8%) 9,63 kDa, mRNA gi|12053083|emb|CAB66719. Query 7 (cDNA clone 1| hypothetical protein KSV-LNDRKVEI 17 IMAGE: 5750487), gi|115298682|ref|NP_055987. KS+ +W+RKVEI partial cds 2| HBxAg transactivated Sbjct 336 Length = 2421 protein 2 KSIRMWERKVEI 347 Score = 486 bits (245) gi|119594637|gb|EAW74231. Score = 24.8 bits (51), Expect = 23 Expect = 1e−134 1| phospholipase C, beta 3 Id = 10/15 (66%), Positives = 10/15 (66%), Id = 259/264 (98%) (phosphatidylinositol- Gaps = 0/15 (0%) Gaps = 0/264 (0%) specific) Query 2 Strand = Plus/Minus gi|119590306|gb|EAW69900. 
SGKIKKSVLWDRKVE 16 1| nucleoporin 133 kDa SG IKK VL D K E Sbjct 1013 SGPIKKPVLRDMKEE 1027 Score = 22.7 bits (46), Expect = 98 Id = 7/8 (87%), Positives = 7/8 (87%) Gaps = 0/8 (0%) Query 10 LWDRKVGI 17 L DRKVGI Sbjct 671 LSDRKVGI 678 60 9_G7 gi|119380763|gb|EF17744 ARRWSRST 15 gi|10732604|gb|AAG22468.1| Score = 24.8 bits (51), Expect = 23 7.1| LCRSICL AF193040_1 unknown Id = 7/8 (87%), Positives = 7/8 (87%), Homo sapiens isolate gi|119584289|gb|EAW63885. Gaps = 0/8 (0%) TA23 mitochondrion, 1| tRNA nucleotidyl Query 2 complete genome transferase, CCA-adding, 1, RRWSRSTL 9 Length = 16569 isoform CRA_a RRWS STL Score = 424 bits (214) gi|68566213|sp|Q8NFZ6|VN1 Sbjct 187 Expect = 2e−116 R2_HUMAN RRWSPSTL 194 Id = 214/214 (100%) Vomeronasal type-1 Score = 24.0 bits (49), Expect = 41 Gaps = 0/214 (0%) receptor 2 Id = 7/9 (77%), Positives = 7/9 (77%), Strand = Plus/Plus (V1r-like receptor 2) Gaps = 2/9 (22%) (hGPCR25) Score = 22.3 bits (45), Expect = 133 Putative G-protein coupled Id = 8/10 (80%), Positives = 8/10 (80%), receptor Gaps = 1/10 (10%) Query 5 SRSTLCRSIC 14 SRS LC SIC Sbjct 373 SRSRLC-SIC 381 61 10_H gi|37704380|ref|NM_0040 LIQHQHLG 10 Zinc finger protein 566 Score = 23.1 bits (47), Expect = 74 8 48.2| QI gi|32470620|sp|Q9Y6K5|QAS Id = 6/7 (85%), Positives = 7/7 (100%),
Homo sapiens beta-2- 3 HUMAN Gaps = 0/7 (0%) microglobulin (B2M), 2′-5′-oligoadenylate Query 1 mRNA synthetase 3 (p100 OAS) LIQHQHL 7 Length = 987 gi|73620903|sp|Q5SZK8|FRE LIQHQ+L Score = 246 bits (124) M2_HUMAN Sbjct 410 Expect = 2e−62 FRAS1-related extracellular LIQHQNL 416 Id = 124/124 (100%) matrix protein 2 precursor Score = 22.7 bits (46), Expect = 99 Gaps = 0/124 (0%) gi|21263627|sp|Q9NQZ8|ZNF Id = 7/8 (87%), Positives 7/8 (87%), Strand = Plus/Plus 71_HUMAN Gaps = 1/8 (12%) Endothelial zinc finger Query 1 protein induced by tumor LIQ-HQHL 7 necrosis factor alpha LIQ HQHL (Zinc finger protein 71) Sbjct 258 Probable ATP-dependent LIQQHQHL 265 helicase DDX41 Score = 21.4 bits (43), Expect = 174 DEAD-box protein 41 Id = 7/9 (77%), Positives = 8/9 (88%) Gaps = 0/9 (0%) Query 1 LIQHQHLGQ 9 LIQ+ HLGQ Sbjct 1379 LIQYVHLGQ 1367 64 2_D6 Homo sapiens actin, RPHCEL 23 gi|119533|sp|P04626|ERBB2_ Score = 25.2 bits (52) Expect = 15 gamma 1, mRNA (cDNA WGMLAP HUMAN Identities = 6/6 (100%), clone IMAGE: 3461395) TDCCHL Receptor tyrosine-protein Positives = 6/6 (100%) Score = 172 bits (87) HRSSF kinase erbB-2 precursor Query 14 Expect = 2e−40 (p185erbB2)(C-erbB-2) PTDCCH 19 Identities = 87/87 (100%) (NEU proto-oncogene) PTDCCH gi|20139132|sp|Q9BZF2|OSR Sbjct: 232 7_HUMAN PTDCCH 237 Oxysterol binding protein- Score = 15.9 bits (30) Expect = 9714, related protein 7 Identities = 5/6 (63%), gi|13633990|sp|Q9NQE7|TSS Positives = 5/6 (63%) P_HUMAN Thymus-specific Query: 19 serine protease precursor HLHRSS 24 gi|29428029|sp|Q9NW08|RP H HRSS C2_HUMAN DNA-directed Sbjct: 1045 RNA polymnerase III HRHRSS 1050 subunit 127.6 kDa Score = 24.0 bits (49), Expect = 38 polypeptide Id = 8/12 (66%), Pos = 8/12 (66%), gi|2811086|sp|P00533|EGFR_ Gaps = 0/12 (0%) HUMAN Query 10 Epidermal growth factor LAPTDCCHLHRS 21 receptor precursor L PTD  HL RS (Receptor tyrosine-protein Sbjct 291 kinase ErbB-1) LPPTDYAHLQRS 302 Score = 23.5 bits (48), Expect = 51 Id = 6/7 (85%), Positives = 7/7 (100%), Gaps = 
0/7 (0%) Query 6 LWGMLAP 12 LWG+LAP Sbjct 17 LWGLLAP 23 67 8_G3 gi|71373270|gb|D013740 PHNTFSAYP 30 gi|24419041|gb|AAL65133.2| Score = 27.4 bits (57), Expect = 2.7 8.1| ECPDVTRT ovarian cancer related Id = 12/24 (50%), Positives = 15/24 (62%), Homo sapiens isolate TPMHTPHE tumor marker CA 125 Gaps = 6/24 (25%) UV122 mitochondrion, TSYHL gi168566146.sp.Q9NQW7|XP Query 2 complete genome P1_HUMAN HNTFSAYPE-----CPDVTRTTPM 20 Length = 16568 Xaa-Pro aminopeptidase 1 H+T SAYPE      P+VT T+ M Score = 496 bits (250) (Soluble aminopeptidase P) Sbjct 4847 Expect 2e−137 gi|32699681|sp|Q9H4B6|SAV HSTVSAYPEPSKVTSPNVT-TSTM 4869 Id = 250/250 (100%) 1_HUMAN Score = 22.3 bits (45), Expect = 91 Gaps = 0/250 (0%) Salvador hoinolog 1 protein Id 10/18 (55%), Positives = 11/18 (61%) Strand = Plus/Plus (45 kDa WW domain Gaps = 5/18 (27%) /gene = “COX1” protein) (hWW45) Query 13 /product = “cytochrome c gi|2230414|sp|O15067|PUR4 DVTRTTPMHTPH--ETSY 28 oxidase subunit 1” _HUMAN DVT T+P   P   ETSY Phosphoribosylformylglycin Sbjct 4440 amidine synthase (FGAM DVTWTSP---PSVAETSY 4454 synthase) Score = 20.6 bits (41), Expect = 296 gi|39793978|gb|AAH63538.1| Id = 9/16 (56%), Positives = 10/16 (62%), PFAS protein Gaps = 2/16 (12%) gi|8134719|sp|Q92966|SNPC Query 7 3_HUMAN AYPECPDVTRTTPMHT 22 snRNA activating protein AY E P V  T+PM T complex 50 kDa subunit Sbjct 6614 (Proximal sequence AYSEPPRV--TSPMVT 6627 element-binding Score = 26.9 bits (56), Expect = 4.9 transcription factor beta Id = 11/18 (61%), Positives = 12/18 (66%), subunit) Gaps = 6/18 (33%) (PSE-binding factor beta Query 13 subunit) (PTF beta subunit) DVTRTTPMH--TPHETSY 28 gi|55960781|emb|CAI16349.1 DVTRT  MH  TP  T+Y hepatoma-derived growth Sbjct 426 factor (high-mobility group DVTRT--MHFGTP--TAY 439 protein 1-like) Score = 26.1 bits (54), Expect = 11 Id = 16/36 (44%), Positives = 19/36 (52%), Gaps = 11/36 (30%) Query 3 PNSSPHNTFSAYPECPDV-TRT-----TPMH-TPHE 31 P+SSP N FS      DV +R      TP+  TPHE Sbjct 53 
PDSSP-NAFST---SGDVVSRNQSFLRTPIQRTPHE 84 Score = 26.1 bits (54), Expect = 11 Id = 10/15 (66%), Positives = 11/15 (73%), Gaps = 1/15 (6%) Query 12 SAYPECPDVTRT-TP 25 SAY  CPD+T T TP Sbjct 826 SAYAVCPDITATVTP 840 Score = 17 .6 bits (34), Expect = 2315 Id = 9/21 (42%), Positives = 10/21 (47%), Gaps = 8/21 (38%) Query 10 ECPDVTRT-------TPMHTP 23 ECP V R        TP+ TP Sbjct 15 ECP-VRRNGQGDAPPTPLPTP 34 68 1_H7 gi|30584378|gb|BT00777 GRVIQE 36 gi|28892|emb|CAA35582.1| Score = 28.6 bits (60), Expect = 1.4 0.1| Synthetic construct PGGRGD unnamed protein product Identities = 12/22 (54%), Positives = 14/22 (63%) Homo sapiens cystatin B KLLHQG gi|2135396|pir||S71548 Gaps = 3/22 (13%) (stefin B) mRNA ARRRRG homeotic protein pG2- Query 15 Score = 280 bit(141) LRTPAS human LHQGARRRRGLRTPASVPISPS 36 Expect = 1e−72 VPISPS gi|48428276|sp|O43432|IF4G L QG RRR L    +SVP +PS Id = 141/141 (100%) 3_HUMAN Sbjct 227 Strand = Plus/Plus Eukaryotic translation LQQGGRRRGDL---SSVPTAPS 245 initiation factor 4 gamma 3 Score = 28.6 bits (60), Expect = 1.3 (elF-4-gamma 3) Identities = 12/22 (54%), Positives = 14/22 (63%), (eIF-4G 3) (eIF4G 3) (eIF-4- Gaps = 3/22 (13%) gamma II) (eIF4GII) Query: 15 gi|2190402|emb|CAA73944.1| LHQGARRRRGLRTPASVPISPS 36 latent TCF-beta binding L QG RRR  L   +SVP +PS protein-4 Sbjct: 228 LQQGGRRRGDL---SSVPTAPS 246 Score = 26.9 bits (56), Expect = 4.3 Identities = 15/28 (53%), Positives = 17/28 (60%) Gaps = 7/28 (25%) Query: 1 GRVIQEPGGRGDKLLHQGARR-------RR 23 GR  Q  PGGRG LL+ G+RR       RR Sbjct: 685 GR----QTPGGRGVPLLNVGSRRSQPGQRR 710 69 2_C8 gi|399924141|gb|BC06441 GSGKIKKSV 20 gi|56202484|emb|CA121939.1 Score = 24.0 bits (49), Expect = 29 8.1| Homo sapiens LWDRKVGI HBxAg transactivated Id = 9/13 (69%), Positives = 9/13 (69%), FK506 binding protein RKN protein 2 (XTP2) Gaps = 0/13 (0%) 9,63 kDa, mRNA gi|119611307|gb|EAW90901. 
Query 2 (cDNAcloneIMAGE: 575 1| BAT2 domain containing SGKIKKSVLWDRK 14 0487), partial cds 1, isoform CRA_d SG IKK VL D K Length = 2421 gi|29427863|sp|Q8WUM0|N Sbjct 83 Score = 486 bits (245) U133_HUMAN SGPIKKPVLRDMK 95 Expect = 1e−134 Nuclear pore complex Score = 24.0 bits (49), Expect = 41 Id = 259/264 (98%) protein Nup133 Id = 6/6 (100%), Positives = 6/6 (100%), Gaps = 0/264 (0%) gi|61216828|sp|Q96HF1|SFR Gaps = 0/6 (0%) Strand = Plus/Minus P2_HUMAN Query 8 Secreted apoptosis - related SVLWDR 13 protein 1 SVLWDR gi|12803735|gb|AAH02704.1 Sbjct 101 Signal transducer and SVLWDR 106 activator of transcription 1 Score = 23.1 bits (47), Expect = 73 gi|29337296|ref|NP_803881.1 Id = 6/6 (100%), Positives = 6/6 (100%), Melanoma antigen family D, Gaps = 0/6 (0%) 4 isoform 2 Query 6 gi|57012811|sp|Q7Z41I9|IBR KKSVLW 11 D2_HUMAN KKSVLW E3 ubiquitin ligase IBRDC2 Sbjct 233 (p53- inducible RING finger KKSVLW 238 protein) Phospholipase C beta 3 70 12_D gi|24430161|ref|NM_0003 TQLV 4 gi|38569484|ref|NP_060111.2 Score = 15.5 bits (29), Expect = 4278 04.2| Homo sapiens kinesin family member 21A Identities = 4/4 (100%), Positives = 4/4 (100%) peripheral myelin gi|21265037|ref|NP_055058.1 Gaps = 0/4 (0%) protein 22 (PMP22), ADAM metallopeptidase Query 1 transcript with throinbospondin type I TQLV 4 variant 1, mRNA motif, 3 proprotein TQLV Length = 1828 gi|57209716|emb|CA140838.1 Sbjct 78 Score = 357 bits (180), cadherin related 23 TQLV 81 Expect = 3e−96 gi|56204273|emb|CAI19274.1 Id = 208/219 (94%) retinoblastoina-associated Gaps = 1/219 (0%) factor 600 (RBAF600) Strand = Plus/Minus gi|55962143|emb|CAI15704.1 calsyntenin 1 gi|42525235|gb|AAS18317.1| CDK5 regulatory subunit associated protein 1 transcript variant 3 73 2_H3 gi|125541850|gb|EF40865 GVSVNEAS 16 gi|58202610|gb|AAW67356.1 Score = 27.8 bits (58), Expect = 2.9 6.1| Homo sapiens YDGKYSSY random intestinal-homing Id = 9/14 (64%), Positives = 10/14 (71%), haplotype H7 antibody heavy chain Gaps = 0/14 (0%) mitochondrion, 
variable region Query 3 complete genome gi|60326824|gb|AAX18924.1| WVNEASYDGKYSSY 16 Length = 16569 anti-TARP (novel breast WV   SYDGKY +Y Score = 1199 bits (605), and prostate tumor- Sbjct 49 Expect = 0.0 associated antigen) WVAVISYDGKYENY 62 Id = 665/677 (98%) immunoglobulin heavy Score = 25.2 bits (52), Expect = 17 Gaps = 0/677 (0%) chain Id = 8/11 (72%), Positives = 8/11 (72%), Strand = Plus/Minus gi|13161090|gb|AAK134791| Gaps = 0/11 (0%) /product = “NADH AF332227_1 heat shock Query 3 dehydrogenase subunit transcription factor 2-like WVNEASYDGKY 13 1” protein WV   SYDGKY gi|119602637|gb|EAW82231. Sbjct 47 1| eukaryotic translation WVAVISYDGKY 57 elongation factor 1 delta Score = 24.4 bits (50), Expect = 31 gi|10719919|sp|Q15326|ZMY Id = 7/8 (87%), Positives = 7/8 (87%), 11_HUMAN Gaps = 0/8 (0%) (Adenovirus 5 E1A-binding Query 2 protein) VSVNEASY 9 VSVNEA Y Sbjct 352 VSVNEAPY 359 75 1_G3 gi|71483115|ref|NM_0010 GAAQPR 28 gi|34526547|dbj|BAC85151.1| Score = 29.9 bits (63), Expect = 0.47 24.3| Homo sapiens NAERRR FLJ00336 protein Id = 10/13 (76%), Positives = 10/13 (76%) ribosomal protein S21 RVRGPV gi|37590807|gb|AAH59399.1| Gaps = 1/13 (7%) (RPS21), mRNA RAAEML Solute carrier family 27 Query 7 Length = 418 R (fatty acid transporter) QP-RNAERRRVR 18 Score = 383 bits (193) gi|3139079|gb|AAC36682.1| QP R AERR RVR Expect = 6e−14 cullin 3 Sbjct 347 Id = 202/206 (98%) gi|51095155|gb|EAL24398.1| QPVREAERRHRVR 359 Gaps = 0/206 (0%) hypothetical protein Score = 29.5 bits (62), Expect = 0.63 Strand = Plus/Plus FLJ36031 Id = 11/18 (61%), Positives = 13/18 (72%) /product = “ribosomal gi|7018298|emb|CAB75612.1| Gaps = 1/18 (5%) protein S21” c394H11.1 (similar to SOX Query 11 (SRY (sex determining AERRRAVRGPVRAA-EML 27 region Y))) AERA R RG +R A +ML gi|22261792|sp|O60312|AT10 Sbjct 173 A_HUMAN AERRSRSRGAIRNACQML 190 Probable phospholipid- Score = 28.6 bits (60), Expect = 1.1 transporting ATPase VA Id = 12/17 (70%), Positives = 12/17 (70%) gi|14009443|dbj|BAB47392.1| Gaps = 2/17
(11%) putative aminophospholipid Query 8 translocase PRNAERRRRVRGPVRAA 24 gi|26986715|gb|AAN86723.1| PR A RRRR RG  RAA gi|54673563|gb|AAH37322.3| Sbjct 122 Transcription factor MLR1 PRAAARRRR-RG-ARAA 136 gi|1169723|sp|P41439|FOLR3 _HUMAN Folate receptor gamma precursor (FR-gamma) gi|55957252|emb|CA112668. pancreas specific transcription factor, 1a gi|11493712|gb|AAG35617.1| homeobox transcription factor gi|57242761|ref|NP_003795.2 receptor (TNFRSF)- interacting serine-threonine kinase 1 gi|60393639|sp|Q3546|RIPK 1_HUMAN Receptor- interacting serine/ threonine-protein kinase 2 (Cell death protein RIP) 76 11_B gi|18088464|gb|BC02116 NSSHH 5 gi|119616592|gb|EAW96186. Score = 19.3 bits (38), Expect = 532 10 7.1| Homo sapiens 1| killer cell lectin-like Id = 5/5(100%), Positives = 5/5(100%), prostaglandin E receptor subfamily C, Gaps = 0/5(0%) synthase 3 (cytosolic), member 3, isoform CRA_c Query 1 mRNA (cDNA clone gi|119618016|gb|EAW97610. NSSHH 5 IMAGE: 3683159), 1| apoptotic peptidase NSSHH partial cds activating factor Sbjct 137 Length = 1057 gi|119630109|gb|EAX09704.1 NSSHH 141 Score = 420 bits (212) dual-specificity tyrosine- Expect = 3e−115 (Y)-phosphorylation Id = 212/212(100%) regulated kinase Gaps = 0/212(0%) 1A, isoform CRA_c Strand = Plus/Plus gi|119580009|gb|EAW59605. 1| SWI/SNF related, matrix associated, actin dependent regulator of chromatin 77 5_C3 gi|98162806|ref|NM_0174 LGTSTAQQ 28 gi|119592169|gb|EAW71763.
Score = 27.4 bits (57), Expect = 3.8 32.3| Homo sapiens HGGWCPEA 1| hCG1794614, isoform Id = 15/26 (57%), Positives = 16/26 (61%), prostate tumor SKDGPHPST gi|21750918|dbj|BAC03866.1| Gaps = 9/26 (34%) overexpressed gene 1 FPQ unnamed protein product Query 8 (PTOV1), mRNA gi|4877793|gb|AAD31434.1| QHGGWCPEASKD--GPHP---STFPQ 28 Length = 1884 DNA methyltransferase 3 QHGG CP  +K   GPHP   ST PQ Score = 206 bits (104), beta 5 Sbjct 13 Expect = 5e−51 gi|62898391|dbj|BAD97135.1 QHGG-CP--AKALPGPHPGVVST-PQ 34 Id = 104/104(100%), BCL2-like 14 isoform 1 Score 27.4 bits (57), Expect = 3.8 Gaps = 0/104(0%) gi|56748611|sp|Q9BZR8|B2L Id = 7/7 (100%), Positives = 7/7 (100%), Strand = Plus/Plus 14_HUMAN Apoptosis Gaps = 0/7 (0%) facilitator Bcl-2-like 14 Query 11 protein GWCPEAS 17 gi|118572673|sp|O14513|NA GWCPEAS P5_HUMAN Sbjct 131 Nck-associated protein 5 GWCPEAS 137 (Peripheral clock protein) Score = 25.2 bits (52), Expect = 16 gi|13633942|sp|Q9UM82|SPA Id = 8/10 (80%), Positives = 8/10 (80%), T2_HUMAN Gaps = 1/10 (10%) Spermatogenesis-associated Query 8 protein 2 QHGGWW-PEA 16 QHG WC PEA Sbjct 145 QHGPWCPPEA 154 79 11_C gi|18088638|gb|BC02088 DCSC 4 gi|119615430|gb|EAW95024. Score = 17.6 bits (34), Expect = 1378 9 9.1| Homo sapiens 1| EPH receptor B2 Id = 4/4 (100%), Positives = 4/4 (100%) cDNA clone gi|119599812|gb|EAW79406. Gaps = 0/4 (0%) IMAGE: 4692359 1| integrin, beta 5 Query 1 Length = 1436 gi|119583687|gb|EAW63283. DCSC 4 Score = 466 bits (235), 1| ADAM metallopetidase DCSC Expect = 4e−129 domain 9 (meltrin gamma) Sbjct 216 Id = 247/253 (97%), gi|115502238|sp|Q9Y2K2|QS DCSC 219 Gaps = 0/253 (0%) K_HUMANSerine|threonine -protein kinase QSK gi|119603107|gb|EAW82701. 
1| phosphorylase kinase, beta, isoform CRA_b 80 5_H4 gi|34304330|ref|NM_1830 NRSRWICG 17 gi|13124092|sp|Q9UBT3|DK Score = 24.0 bits (49), Expect = 29 63.1| PLGLIKALV K4_HUMAN Dickkopf- Id = 8/13 (61%), Positives = 9/13 (69%) Homo sapiens ring related protein 4 precursor Gaps = 3/13 (23%) finger protein 7 (RNF7), (Dkk-4) (24) Query 5 transcript variant 2, gi|51477372|ref|XP_293360.4 WICGPLGLIKALV 17 mRNA PREDICTED: Similar to W+C PLG   ALV Length = 1880 Ubiquitin ligase SIAH1 Sbjct 11 Score = 708 bits (357) Seven in absentia homolog WLCSPLG---ALV 20 Expect = 0.0 1-like Score = 22.3 bits (45), Expect = 95 Id = 365/369 (98%) gi|48146981|emb|CAG3 Id = 7/9 (77%), Positives = 8/9 (88%) Gaps = 0/369 (0%) 3713.1| Gaps = 0/9 (0%) Strand = Plus/Plus NOV [Homo sapiens] Query 8 gi|1352515|sp|P48745|NOV_ GPLGLIKAL 16 HUMAN NOV protein GPLGLI+ L homolog precursor (NovH) Sbjct 72 (Nephroblastoma GPLGLIRNL 80 overexpressed gene protein Score = 22.3 bits (45), Expect = 129 homolog) Id = 5/5 (100%), Positives = 5/5 (100%), gi|55925626|ref|NP_0010072 Gaps = 0/5 (0%) 54.1| endogenous retroviral Query 5 sequence 3 WICGP 9 gi|44887883|sp|Q14264|ENR WICGP 1_HUMAN HERV- Sbjct 419 R_7q21.2 provirus ancestral WICGP 423 Env polyprotein precursor (Envelope polyprotein) 83 2_H7 gi|4506788|ref|NM_00297 RKGKKSKR 14 gi|8924242|ref|NP_061136.1| Score = 25.7 bits (53), Expect = 9.1 0.1| RKWLNS sarcoma antigen 1 Id = 10/20 (50%), Positives = 12/20 (60%), Homo sapiens gi|8216987|emb|CAB92443.1| Gaps = 7/20 (35%) spermidine/spermine putative tumor antigen Query 1 N1-acetyltransferase gi|51458801|ref|XP_089384.7 RKGKK---SKRRK----WLN 13 (SAT), mRNA | PREDICTED: similar to RK K+   SKRRK    WL+ Length = 1060 RIKEN cDNA A430025D11 Sbjct 51 Score = 1031 bits (520) gi|23396625|sp|Q9NQT8|K11 RKSKRHSSSKRRKSMSSWLD 70 Expect = 0.0 3B_HUMAN Kinesin-like Score = 25.2 bits (52), Expect = 12 Id = 520/520 (100%) protein KIF13B Id = 7/9 (77%), Positives = 8/9 (88%), Gaps = 0/520 (0%) gi|57013078|sp|Q7Z699|SPR Gaps =
0/9 (0%) Strand = Plus/Plus E1_HUMAN Sprouty- Query 4 related, EVH1 domain KKSKRRKWL 12 containing protein 1 KK K+RKWL gi|31565770|gb|AAH53600.1| Sbjct 22 Transmembrane and coiled- KKKKKRKWL 30 coil domains 4 Score = 24.4 bits (50), Expect = 30 gi|56202682|emb|CA120092.1 Id = 7/7 (100%), Positives = 7/7 (100%), novel protein Gaps = 0/7 (0%) gi|7227890|sp|O24175|FL_O Query 4 RYSA Putative KKSKRRK 10 transcription factor FL KKSKRRK gi|51701343|sp|Q8TD10|CHD Sbjct 60 5_HUMAN Chromodomain KKSKRRK 66 helicase-DNA-binding Score = 24.8 bits (51), Expect = 22 protein 5(CHD-5) Id = 6/6 (100%), Positives = 6/6 (100%) gi|126178|sp|P14151|LYAM1 Gaps = 0/6 (0%) _HUMAN L-selcctin Query 8 precursor (Lymph node RRKWLN 13 homing receptor) RRKWLN (Leukocyte adhesion Sbjct 1092 molecule 1) (gp90-MEL) RRKWLN 1097 gi|18378735|ref|NP_055625.2 Score = 24.4 bits (50), Expect = 22 centrosome-associated Id = 8/11 (72%), Positives = 8/11 (72%), protein 350 Gaps = 1/11 (9%) Query 1 RKGKKSKRRKW 11 RK KK  RRKW Sbjct 178 RKKKENRRKW 187 85 9_C3 gi|92444107|gb|BC01388 GDPNSS 6 gi|119615660|gb|EAW95254. Score 18.5 bits (36), Expect = 1148 7.1| Homo sapiens 1| cytoplasmic linker Id = 5/5 (100%), Positives = 5/5 (100%), mRNA similar to mouse associated Gaps = 0/5 (0%) double minute 2, human gi|119607399|gb|EAW86993. Query 1 homolog of; p53-binding 1| telomeric repeat binding GDPNS 5 protein (cDNA clone factor (NIMA-interacting) 1 GDPNS IMAGE: 3841679) gi|119588279|gb|EAW67873. Sbjct 177 Length = 3865 1| protein tyrosine GDPNS 181 Score = 835 bits (421) phosphatase, receptor typeJ Score = 18.5 bits (36), Expect = 1148 Expect = 0.0 gi|119626399|gb|EAX05994.1 Id = 5/5 (100%), Positives = 5/5 (100%), Id = 423/424 (99%) dentin sialophosphoprotein Gaps = 0/5 (0%) Gaps = 0/424 (0%) gi|119613825|gb|EAW93419. 
Query 2 Strand = Plus/Plus 1| TNF receptor-associated DPNSS 6 factor 5 DPNSS gi|116242829|sp|Q9Y4A5|TR Sbjct 2031 RAP_HUMAN Transformation/ DPNSS 2035 transcription domain- associated protein 87 2_C2 gi|30584378|gb|BT00777 GRVIQE 36 gi|28892|emb|CAA35582.1| Score = 28.6 bits (60), Expect = 1.4 0.1| Synthetic construct PGGRGD unnamed protein product Identities = 12/22 (54%), Positives = 14/22 (63%), Homo sapiens cystatin B KLLHQG gi|2135396|pir||S71548 Gaps = 3/22 (13%) (stefin B) mRNA ARRRRG homeotic protein pG2 Query 15 Score = 280 bits (141) LRTPAS gi|48428276|sp|O43432|IF4G LHQGARRRRGLRTPASVPISPS 36 Expect = 1e−72 VPISPS 3_HUMAN L QG RRR L   +SVP +PS Id = 141/141 (100%) Eukaryotic translation Sbjct 227 Strand = Plus/Plus initiation factor 4 gamma 3 LQQGGRRRGDL---SSVPTAPS 245 (eIF-4-gamma II) (eIF4GII) Score = 28.6 bits (60), Expect = 1.3 gi|2190402|emb|CAA73944.1| Identities = 12/22 (54%), Positives = 14/22 (63%), latent TGF-beta binding Gaps = 3/22 (13%) protein-4 Query: 15 LHQGARRRRGLRTPASVPISPS 36 L QG RRR  L   +SVP +PS Sbjct: 228 LQQGGRRRGDL---SSVPTAPS 246 Score = 26.9 bits (56), Expect = 4.3 Identities = 15/28 (53%), Positives = 17/28 (60%), Gaps = 7/28 (25%) Query: 1 GRVIQEPGGRGDKLLHQGARR-----RR 23 GR  Q PGGRG  LL+ G+RR     RR Sbjct: 685 GR--QTPGGRGVPLLNVGSRRSQPGQRR 710 90 10_F gi|125656973|gb|EF06115 VGNGEG 11 gi|20177960|sp|Q96JB6|LOX Score = 24.8 bits (51), Expect = 21 11 0.1| RLEVL L4_HUMAN Id = 7/7 (100%), Positives = 7/7 (100%), Homo sapiens isolate Lysyl oxidase homolog 4 Gaps = 0/7 (0%) UV0758 mitochondrion, gi|4826914|ref|NP_005081.1| Query 11 complete genome phospholipase A2, group IVB EGRLEVL 17 Length = 16573 gi|3811347|gb|AAC78836.1| EGRLEVL Score = 208 bits (105) cytosolic phospholipase A2 Sbjct 43 Expect = 3e−51 gi|2745961|gb|AAB94793.1| EGRLEVL 49 Id = 112/115 (97%) Bcd orf2 Score = 24.4 bits (50), Expect = 22 Gaps = 0/115 (0%) gi|5453978|ref|NP_006250.1| Id = 7/7 (100%), Positives = 7/7 (100%), Strand = Plus/Minus protein kinase, cGMP- Gaps =
0/7 (0%) /product = “NADH dependent type II Query 4 dehydrogenase subunit gi|51473081|ref|XP_372641.2 GEGRLEV 10 1” Spermatogenesis associated GEGRLEV protein PD1 Sbjct 349 gi157282312|emb|CAD43180. GEGRLEV 355 3| interleukin-1 receptor Score = 22.3 bits (45), Expect = 97 associated kinase-2 Id = 6/8 (75%), Positives = 8/8 (100%), gi|20177834|sp|O43866|CD5 Gaps = 0/8 (0%) L_HUMAN CD5 antigen- Query 4 like precursor (SP-alpha) GEGRLEVL 11 (IgM-associated peptide) G+GRLE+L Sbjct 152 GQGRLEIL 159 93 2_G2 gi|6706619|emb|AJ25197 GSRVRMSG 14 gi|11322769|emb|CAC16957. Score = 26.5 bits (55), Expect = 7.1 3.1|HSA251973 KKKERK gap junction protein, alpha Id = 8/10 (80%), Positives = 8/10 (80%), Homo sapiens partial 3, 46 kDa (connexin 46) Gaps = 0/10 (0%) steerin-1 gene gi|119607339|gb|EAW86933. Query 4 Length = 200033 1| centrosome and spindle VRMSGKKKER 13 Features in this part of pole associated protein 1, VRM  KKKER subject sequence: gi|119578900|gb|EAW58496. Sbjct 100 Steerin-1 protein nucleolar protein) family VRMEEKKKER 109 Score = 662 bits (334) 6 (RNA-associated), isoform Score = 24.4 bits (50), Expect = 31 Expect = 0.0 gi|20141241|sp|P50454|SERP Id = 7/8 (87%), Positives = 8/8 (100%), Id = 370/380 (97%) H_HUMANSerpin H1 Gaps = 0/8 (0%) Gaps = 2/380 (0%) precursor (Collagen- Query 6 Strand = Plus/Minus binding protein) MSGKKKER 13 (Proliferation-inducing gene +SGKKKER 14 protein) sbjct 5 gi|30583027|gb|AAP35758.1| LSGKKKER 12 serine (or cysteine) Score = 24.0 bits (49), Expect = 41 proteinase inhibitor, clade Id = 7/10 (70%), Positives = 9/10 (90%), H (heat shock protein 47), Gaps = 0/10 (0%) member 2 Query 4 VRMSGKKKER 13 VR+S KKK+R Sbjct 95 VRLSEKKKDR 104 94 10_F gi|21411332|gb|BC03101 NSSVS 5 gi|119626680|gb|EAX06275.1 Score = 17.2 bits (33), Expect = 2312 8 2.1| alpha-kinase 1, isoform Id = 5/5 (100%), Positives = 5/5 (100%), Homo sapiens gi|119624697|gb|EAX04292.1 Gaps 0/5 (0%) eukaryotic translation ectonucleotide Query 1 elongation factor 1 
pyrophosphatase/phospho- NSSVS 5 gamma, diesterase 4 NSSVS mRNA (cDNA clone gi|119597572|gb|EAW77166. Sbjct 3340 MGC: 32765 1| AT hook containing NSSVS 3344 IMAGE: 4654721), transcription factor 1 complete cds gi|119583836|gb|EAW63432. Length = 1441 1| testis expressed sequence Score = 965 bits (487) 15, isoform CRA_a Expect = 0.0 gi|118572823|sp|Q96QP1|AL Id = 529/550 (96%) PK1_HUMAN Lymphocyte Gaps = 0/550 (0%) alpha-protein kinase Strand = Plus/Plus gi|91208166|sp|Q96Q15|SMG 1_HUMAN Serine/threonine -protein kinase SMG1 95 10_H gi|33869450|gb|BC01719 SICA 4 gi|119621016|gb|EAX00611.1 Score = 15.9 bits (30), Expect = 4455 3 4.2| solute carrier family 30 Id = 4/4 (100%), Positives = 4/4 (100%), Homo sapiens E74-like (zinc transporter), member 3 Gaps = 0/4 (0%) factor 4 (ets domain gi|119629012|gb|EAX08607. Query 1 transcription factor), FRAS1 related extracellular SICA 4 mRNA (cDNA clone matrix protein 2 SICA MGC: 1755 gi|119590121|gb|EAW69715. Sbjct 741 IMAGE: 3138355), 1| nuclear VCP-like SICA 744 complete cds gi|110282976|sp|P46020|KPB Length = 4121 1_HUMAN Score = 904 bits (456), Phosphorylase b kinase Expect = 0.0 regulatory subunit alpha, Id = 464/467 (99%) skeletal muscle isoform Gaps = 0/467 (0%) gi|62087778|dbj|BAD92336. Strand = Plus/Plus rearranged L-myc fusion sequence variant 96 11_C gi|109148511|ref|NR_003 QLRISTTRS 11 gi|119628939|gb|EAX08534.1 Score = 24.0 bits (49), Expect = 41 1 089.1| Homo sapiens WT hCG2042156 Id = 7/10 (70%), Positives = 8/10 (80%), OTU domain, ubiquitin gi|116496611|gb|AA126137.1| Gaps = 0/10 (0%) aldehyde binding 1 DNASE2B protein Query 1 (OTUB1), transcript gi|46395921|sp|Q8WZ79|DN QLRISTTRSW 10 variant 2, transcribed S2B_HUMAN QLR ST R+W RNA Deoxyribonuclease-2-beta Sbjct 88 Length = 2518 gi|17066106|emb|CAD12457. QLRDSTARAW 97 Score = 829 bits (418) 1| Novex-3 Titin Isoform Score = 23.1 bits (47), Expect = 75 Expect = 0.0 gi|119610438|gb|EAW90032.
Id = 6/6 (100%), Positives = 6/6 (100%), Id = 425/428 (99%) 1| netrin 1 Gaps = 0/6 (0%) Gaps = 0/428 (0%) gi|119582311|gb|EAW61907. Query 5 Strand = Plus/Plus 1| centaurin, delta 3, STTRSW 10 gi|47779175|gb|AAT38470.1| STTRSW immunoglobulin heavy Sbjct 68 chain STTRSW 73 gi|5456914|gb|AAD43707.1| Score = 22.3 bits (45), Expect = 134 protocadherin alpha 5 Id = 6/8 (75%), Positives = 7/8 (87%), gi|4502681|ref|NP_001772.1| Gaps = 0/8 (0%) CD69 antigen (p60, early T- Query 3 cell activation antigen) RISTTRSW 10 gi|584906|sp|Q07108|CD69_ RIS+T SW HUMAN Early activation Sbjct 3491 antigen CD69 RISSTSSW 3498 (Early T-cell activation Score = 16.8 bits (32), Expect = 6145 antigen p60) Id = 6/9 (66%), Positives = 7/9 (77%), Gaps = 2/9 (22%) Query 3 RISTT--RS 9 RIST+  RS Sbjct 312 RISTSPIRS 320 98 2_C3 gi|30584378|gb|BT00777 GRVIQE 36 gi|28892|emb|CAA35582.1| Score = 28.6 bits (60), Expect = 1.4 0.1| Synthetic construct PGGRGD unnamed protein product Identities = 12/22 (54%), Positives = 14/22 (63%), Homo sapiens cystatin B KLLHQG gi|2135396|pir||S71548 Gaps = 3/22 (13%) (stefin B) mRNA ARRRRG homeotic protein pG2- Query 15 Score = 280 bit(141) LRTPAS human LHQGARRRRGLRTPASVPISPS 36 Expect = 1e−72 VPISPS gi|48428276|sp|O43432|IF4G L QG RRR  L   +SVP +PS Id = 141/141 (100%) 3_HUMAN Sbjct 227 Strand = Plus/Plus Eukaryotic translation LQQGGRRRGDL---SSVPTAPS 245 initiation factor 4 gamma 3 Score = 28.6 bits (60), Expect = 1,3 (eIF-4-gamma II) Identities = 12/22 (54%), Positives = 14/22 (63%), gi|2190402|emb|CAA73944.1| Gaps = 3/22 (13%) latent TGF-beta binding Query: 15 protein-4 LHQGARRRRGLRTPASVPISPS 36 L QG RRR  L   +SVP +PS Sbjct: 228 LQQGGRRRGDL---SSVPTAPS 246 Score = 26.9 bits (56), Expect = 4.3 Identities = 15/28 (53%), Positives = 17/28 (60%), Gaps = 7/28 (25%) Query: 1 GRVIQEPGGRGDKLLHQGARR-----RR 23 GR  Q PGGRG LL+G+RR      RR Sbjct: 685 GR--QTPGGRGVPLLNVGSRRSQPGQRR 710 10 8_D4 gi|76779425|gb|BC10604 DMSYK 5 gi|59798440|sp|Q9UNA4|PO Score = 18.5 bits (36), Expect = 933  
6 6.1| L1_HUMAN Id = 4/5 (80%), Positives = 5/5 (100%), Homo sapiens pre-B-cell DNA polymerase iota Gaps = 0/5 (0%) colony enhancing factor (RAD30 homolog B) (Eta2) Query 1 1, transcript variant 1, gi|56757695|sp|Q9ULC6|PA DMSYK 5 mRNA (cDNA clone DI1_HUMANProtein- +MSYK MGC: 117256 arginine deiminase type I Sbjct 104 IMAGE: 6161081), (Peptidylarginine deiminase EMSYK 108 complete cds I) Score = 18.0 bits (35), Expect = 914 Length = 2144 gi|3122589|sp|O15355|PP2C Id = 4/4 (100%), Positives = 4/4 (100%), Score = 256 bits (129) G_HUMAN Protein Gaps = 0/4 (0%) Expect = 3e−65 phosphatase 2C gamma Query 1 Id = 144/151 (95%) isoform DMSY 4 Gaps = 0/151 (0%) gi|68067549|sp|P07858|CAT DMSY Strand = Plus/Plus B_HUMAN Cathepsin B Sbjct 162 precursor (Cathepsin B1) DMSY 165 gi|12585368|sp|QI6890|TPD5 3_HUMAN Tumor protein D53 (hD53) gi|52000845|sp|Q9UL12|SAR H_HUMAN Sarcosine dehydrogenase, mitochondrial precursor (SarDH) (BPR-2) gi|1711371|sp|P54764|EPHA4 _HUMAN Ephrin type-A receptor 4 precursor (Tyrosine-protein kinase receptor SEK) gi|17380181|sp|O60508|PR17 HUMAN Pre-mRNA splicing factor PRP17 (hPRP17) (Cell division cycle 40 homolog) (EH-binding protein 3) gi|62087974|dbj|BAD92434. 
protein phosphatase 1G variant 10 1_D6 gi|20357564|ref|NM_0001 RGDKLLHQ 27 gi|28892|emb|CAA35582.1| Score = 28.6 bits (60), Expect = 1.6  8 00.2| Homo sapiens GARRRRGL unnamed protein product Id = 12/22 (54%), Positives = 14/22 (63%), cystatin B (stefin B) RTPASVPIS gi|2135396|pir||S71548 Gaps = 3/22 (13%) (CSTB), mRNA PS homeotic protein pG2- Query 6 Length = 674 human LHQGARRRRGLRTPASVPISPS 27 Score = 232 bits (117) gi|38648796|gb|AAH63316.1| L QG RRR  L   +SVP +PS Expect = 9e−59 FBXL17 protein Sbjct 228 Id = 117/117 (100%) LQQGGRRRGDL---SSVPTAPS 246 Gaps = 0/117 (0%) Score = 28.2 bits (59), Expect = 2.1 Strand = Plus/Plus Id = 11/18 (61%), Positives = 11/18 (61%), Gaps = 5/18 (27%) Query 11 RRRRGL-----RTPASVP 23 RRRR L     RTPA VP Sbjct 25 RRRRPLLRLPRRTPAKVP 42 10 4_G4 gi|56788032|gb|AY69246 RPARSR 14 gi|57163116|emb|CA139857.1 Score = 26.5 bits (55), Expect = 5.0  9 4.1| RMMAW protein (peptidyl-prolyl Id = 7/8 (87%), Positives = 7/8 (87%), Homo sapiens growth- GKA cis/trans isomerase) NIMA- Gaps = 0/8 (0%) inhibiting gene 46 interacting, 4 (parvulin) Query 1 mRNA, complete cds gi|32171215|ref|NP_859066.1 RPARSRRM 8 Length = 1715 transducer of regulated RP RSRRM Score = 825 bits (416) cAMP response element- Sbjct 14 Expect = 0.0 binding protein (CREB) 2 RPERSRRM 21 Id = 416/416 (100%) gi|59800455|sp|Q9UPY6|WA Score = 23.5 bits (48), Expect = 40 Gaps = 0/416 (0%) SF3_HUMAN Id = 6/7 (85%), Positives = 6/7 (85%), Strand = Plus/Minus Wiskott-Aldrich syndrome Gaps = 0/7 (0%) protein family member 3 Query 6 (WASP-family protein RRMMAWG 12 member 3) RR MAWG gi|4927214|gb|AAD33054.1| Sbjct 144 Scar3 RRTMAWG 150 gi|57013057|sp|Q6STE5|SMR Score = 23.1 bits (47), Expect = 53 D3_HUMAN Id = 8/14 (57%), Positives = 9/14 (64%), SWI/SNF-related matrix- Gaps = 4/14 (28%) associated actin-dependent Query 1 regulator of chromatin RPARSRR----MMA 10 subfamily D member 3 R AR+RR    MMA gi|19421557|gb|AAK56405.1 Sbjct 149 chromodomain helicase RKARNRRQEWNMMA 162 DNA binding protein 5 
gi|1399745|gb|AAB08988.1| myelodysplasia/myeloid leukemia factor 2 gi|73921220|sp|Q86WB0|NIP A_HUMAN Nuclear-interacting partner of anaplastic lymphoma kinase) (hNIPA) gi|41019490|sp|P49736|MCM 2_HUMAN DNA replication licensing factor MCM2 (Nuclear protein DM28) gi|1706888|sp|P53539|FOSB_HUMAN Protein fosB (G0/G1 switch regulatory protein 3) 11 10_E gi|123998528|gb|D08959 AETVGP 29 gi|113422952|ref|XP_001129 Score = 26.5 bits (55), Expect = 6.8  2 8 40.2| GREEGC 208.1| PREDICTED: Id = 11/16 (68%), Positives = 11/16 (68%), Synthetic construct WQRGRP hypothetical protein Gaps = 5/16 (31%) Homo sapiens clone NEETTC gi|34785506|gb|AAH57760.1| Query 5 IMAGE: 3938260; PSSRS MORN repeat containing 3 GPGREEGC--W-QRGR 17 FLH191546.01L; gi|119622706|gb|EAX02301.1 GPGR  GC  W QRGR RZPDo839E0667D proprotein convertase Sbjct 101 ribosomal protein L7a subtilisin/kexin type 6, GPGR--GCGGWVQRGR 114 (RPL7A) gene, encodes isoform CRA_a Score = 25.2 bits (52), Expect = 16 complete protein Id = 6/7 (85%), Positives = 7/7 (100%), Length = 841 Gaps = 0/7 (0%) Score = 678 bits (342) Query 10 Expect = 0.0 EGCWQRG 16 Id = 342/342 (100%) EGCW+RG Gaps = 0/342 (0%) Sbjct 161 Strand = Plus/Plus EGCWERG 167 Score = 24.8 bits (51), Expect = 22 Id = 7/7 (100%), Positives = 7/7 (100%), Gaps = 0/7 (0%) Query 4 VGPGREE 10 VGPGREE Sbjct 620 VGPGREE 626 11 8_H1 gi|114520617|ref|NM_021 PIDGLATSA 94 gi|119623272|gb|EAX02867.1 Score = 65.1 bits (146), Expect = 1e−10  4 0 130.3| Homo sapiens IMACEVTT hCG1792883, isoform Id = 23/35 (65%), Positives = 27/35 (77%), peptidylprolyl isomerase LTHKPWNN gi|119623270|gb|EAX02865.1 Gaps = 1/35 (2%) A (cyclophilin A) SVKAGTLIT hCG1792883, isoform Query 41 (PPIA), mRNA KSFLSSAQS gi|119624576|gb|EAX04171.1 AQSTKIFCCLWDLVCKQLKGDAAQGLAVDGNVKEQ 75 Length = 2276 TKIFCCLW solute carrier family 22 AQS KIFCCLW+ V KQL  DAAQGL + G+V+E Score = 436 bits (220) DLVCKQLK gi|119577790|gb|EAW57386. 
Sbjct56 Expect = 5e−120 GDAAQGLA 1| dystrophia myotonica- AQSMKIFCCLWNFVYKQL-EDAAQGLTMGGDVEEH 89 Id = 223/224 (99%) VDGNVKEQ containing WD repeat Score = 58.3 bits (130), Expect = 1e−08 Gaps = 0/224 (0%) SIHKLHNTA motif, isoform CRA_a Id = 20/31 (64%), Positives = 24/31 (77%), Strand = Plus/Minus RXSLRPHSS gi|119570084|gb|EAW49699. Gaps = 1/31 (3%) N 1| F-box and leucine-rich Query 45 repeat protein 15 KIFCCLWDLVCKQLKGDAAQGLAVDGNVKEQ 75 gi|119570183|gb|EAW49798. KIFCCLW+ V KQL  DAAQGL + G+V+E 1| mitochondrial ribosomal Sbjct 2 protein L43, isoform CRA_c KIFCCLWNFVYKQL-EDAAQGLTMGGDVEEH 31 gi|32165518|gb|AAP72126.1| Score = 27.4 bits (57), Expect = 25 G protein-coupled receptor Id = 10/15 (66%), Positives = 11/15 (73%), 120 Gaps = 4/15 (26%) Query 49 CL---WDLVCKQLKG 60 CL   WDLVC+Q KG Sbjct 136 CLSLQWDLVCEQ-KG 149 11 6_G8 gi|30584378|gb|BT00777 GRVIQE 36 gi|28892|emb|CAA35582.1| Score = 28.6 bits (60), Expect = 1.4  5 0.1| Synthetic construct PGGRGD unnamed protein product Identities = 12/22 (54%), Positives = 14/22 (63%), Homo sapiens cystatin B KLLHQC gi|2135396|pir||S71548 Gaps = 3/22 (13%) (stefin B) mRNA ARRRRG hoemotic protein pG2 Query 15 Score = 280 bit(141) LRTPAS gi|48428276|sp|O43432|IF4G LHQGARRRRGLRTPASVPISPS 36 Expect = 1e−72 VPISPS 3_HUMAN L QG RRR L   +SVP +PS Id = 141/141 (100%) Eukaryotic translation Sbjct 227 Strand = Plus/Plus initiation factor 4 gamma 3 LQQGGRRRGDL---SSVPTAPS 245 (eIF-4-gamnma 3) Score = 28.6 bits (60), Expect 1.3 (eIF-4-gamma II) Identities = 12/22 (54%), Positives = 14/22 (63%), gi|2190402|emb|CAA73944.1| Gaps = 3/22 (13%) latent TGF-beta binding Query: 15 protein-4 LHQGARRRRGLRTPASVPISPS 36 L QG RRR  L   +SVP +PS Sbjct: 228 LQQGGRRRGDL---SSVPTAPS 246 Score = 26.9 bits (56), Expect = 4.3 Identities = 15/28 (53%), Positives = 17/28 (60%), Gaps = 7/28 (25%) Query: 1 GRVIQEPGGRGDKLLHQGARR-----RR 23 GR  Q PGGRG  LL+ G+RR     RR Sbjct: 685 GR--QTPGGRGVPLLNVGSRRSQPGQRR 710 11 1_D4 gi|20357564|ref|NM_0001 GRVIQE 34 
gi|28892|emb|CAA35582.1| Score = 28.6 bits (60), Expect = 1.4  7 00.2| Homo sapiens PGGRGD unnamed protein product Identities = 12/22 (54%), Positives = 14/22 (63%), cystatin B (stefin B) KLLHQG gi|2135396|pir||S71548 Gaps = 3/22 (13%) (CSTB) cystatin B (liver ARRRRG homeotic protein pG2 Query 15 thiol proteinase LRIPAS gi|48428276|sp|O43432|IF4G LHQGARRRRGLRTPASVPISPS 36 inhibitor), mRNA VPISPS 3_HUMAN L QG RRR L    +SVP +PS Length = 674 Eukaryotic translation Sbjct 227 Score = 412 bits (208) initiation factor 4 gamma 3 LQQGGRRRGDL---SSVPTAPS 245 Expect = 8e−113 (eIF-4-gamma 3) Score = 28.6 bits (60), Expect = 1.3 Id = 208/208 (100%) (eIF-4-gamma II) Identities = 12/22 (54%), Positives = 14/22 (63%), Gaps = 0/208 (0%) gi|2190402|emb|CAA73944.1| Gaps = 3/22 (13%) Strand = Plus/Plus latent TGF-beta binding Query: 15 protein-4 LHQGARRRRGLRTPASVPISPS 36 L QG RRR  L   +SVP +PS Sbjct: 228 LQQGGRRRGDL---SSVPTAPS 246 Score = 26.9 bits (56), Expect = 4.3 Identities = 15/28 (53%), Positives = 17/28 (60%), Gaps = 7/28 (25%) Query: 1 GRVIQEPGGRGDKLLHQGARR-----RR 23 GR  Q PGGRG  LL+ G+RR     RR Sbjct: 685 GR--QTPGGRGVPLLNVGSRRSQPGQRR 710 11 6_C4 gi|71559137|ref|NM_0010 SSQGHF 6 gi|21753429|dbj|BAC04343.1 Score = 21.8 bits (44), Expect = 78  8 29.3| Unnamed protein product Id = 6/6 (100%), Pos = 6/6 (100%), Homo sapiens ribosomal gi|12643872|sp|Q9UBS0|KS6 Gaps = 0/6 (0%) protein S26 (RPS26), B2_HUMAN Query 1 mRNA Ribosomal protein S6 SSQGHF 6 Length = 699 kinase 2 (S6K2) SSQGHF Score = 367 bits (185) (70 kDa ribosomal protein) Sbjct 74 Expect = 1e−98 (S6 kinase-related kinase) SSQGHF 79 Id = 202/210 (96%) (Serine/threonine-protein Score = 18.0 bits (35), Expect = 1097 Gaps = 0/210 (0%) kinase 14 beta) Id = 5/5 (100%), Pos = 5/5 (100%), Strand = Plus/Plus Colony stimulating factor 1 Gaps = 0/5 (0%) Roquin (RING finger and Query 1 C3H zinc finger protein 1) SSQGH 5 G protein coupled receptor SSQGH 123 Sbjct 189 Solute carrier family 44, SSQGH 193 member 5 Choline transporter-like 
protein 5 12 1_D8 gi|123997386|gb|DQ8953 LPRVQAQG 64 gi|48429165|sp|Q86V81|THO Score = 36.7 bits (79), Expect = 0.016  1 69.2| GRVPEETE C4_HUMAN Id = 19/37 (51%), Positives = 21/37 (56%), Synthetic construct GAGGGRGR THO complex subunit 4 Gaps = 14/37 (37%) Homo sapiens clone QGRAGAPA (Tho4) (Ally of AML-1 and Query 8 IMAGE: 3505011; GRGTAAAQ LEF-1) (Transcriptional GGRVPEETEGAGGGRGRQGRAGAPAGRGTAAAQGGAE 44 FLH183687.01L; GGAELGAE coactivator Aly/REF) (bZIP GGR        GGGRGR GRAG+  GRG     GGA+ RZPDo839F06141D AGGDAQEG enhancing factor BEF) Sbjct 21 CDC37 cell division SLRPHSSN gi|47117879|sp|P83369|LSM1 GGR--------GGGRGR-GRAGSQGGRG-----GGAQ 43 cycle 37 homolog (S. 1_HUMAN Score = 21.0 bits (42), Expect = 848 cerevisiae) (CDC37) U7 snRNA-associated Sm- Id = 10/16 (62%), Positives = 10/16 (62%), gene, encodes complete like protein LSm11 Gaps = 4/16 (25%) protein gi|30802096|gb|AAH51353.1| Query 19 Length = 1177 LSM11 protein GGGRGRQGRAGAPAGR 34 Score = 333 bits (168) GG RGR GR    AGR Expect = 2e−88 Sbjct 221 Id = 168/168 (100%) GGARCR-GRG---AGR 232 Gaps = 0/168 (0%) Score = 35.0 bits (75), Expect = 0.052 Strand = Plus/Plus Id = 17/24 (70%), Positives = 17/24 (70%), Gaps = 2/24 (8%) Query 19 GGGRGRQGRA-GAPAGRGTAAAQG 41 GGGRGR GRA GA AG G  AA G Sbjct 69 GGGRGR-GRARGAAAGSGVPAAPG 91 12 5_C6 gi|75992939|ref|NM_0046 VVSPPSSAR 22 gi|119592058|gb|EAW71652 Score = 29.1 bits (61), Expect = 1.2  2 51.3| Homo sapiens PACVCPSSS 1| hCG2006056, isoform Id = 11/17 (64%), Positives = 12/17 (70%), ubiquitin specific DPPF CRA_d Gaps = 4/17 (23%) peptidase 11 (USP11), gi|119618258|gb|EAW97852. 
Query 5 mRNA 1| acetyl-Coenzyme A PSSARPACVCPSS--SD 19 Length = 3300 carboxylase beta, isoform PS  RP+CVCP S  SD Score = 446 bits (225) gi|14589876|ref|NP_115835.1 Sbjct 66 Expect = 5e−123 embryonal Fyn-associated PS--RPSCVCPCSARSD 80 Id = 225/225(100%) substrate isoform 2; Efs2 -- Score = 27.4 bits (57), Expect = 3.8 Gaps = 0/225(0%) gi|29336893|sp|Q96DN6|MB Id = 10/19 (52%), Positives = 11/19 (57%), Strand = Plus/Plus D6_HUMAN Methyl-CpG- Gaps = 8/19 (42%) binding domain protein 6 Query 4 gi|51466429|ref|XP_380018.2 PPSSARPACVCPSSSDPPF 22 similar to Ankyrin repeat PPSSARPA        PP+ and IBR domain-containing Sbjct 254 protein 1; ANK1B1 protein PPSSARPA--------PPY 264 gi|88943889|sp|Q70EK8|UBP Score = 26.5 bits (55), Expect = 6.9 53_HUMAN Id = 12/21 (57%), Positives = 13/21 (61%), Inactive ubiquitin carboxyl- Gaps = 6/21 (28%) terminal hydrolase 53 Query 1 VVSPPSSARPACVCPSSSDPP 21 VV PP  ARP   CP+S  PP Sbjct 9 VVPPP--ARP---CPTSC-PP 23 12 6_F4 gi|17426141|gb|AF32532 NSKE 4 gi|119626370|gb|EAX05965.1 Score = 18.0 bits (35), Expect = 1284  4 6.1|F325326S0| Homo protein tyrosine Id = 5/5 (100%), Positives = 5/5 (100%), sapiens macrophin 1 phosphatase, non-receptor Gaps = 0/5 (0%) isoforms (MACF1) gene, type 13 Query 1 exon 3 gi|119617338|gb|EAW96932. NSSKE 5 Length = 2486 1| ubiquitin specific NSSKE Score = 803 bits (405) peptidase 52, isoform Sbjct 916 Expect = 0.0 CRA_e NSSKE 920 Id = 405/405 (100%) gi|119585128|gb|EAW64724. 
Gaps = 0/405 (0%) 1| kinesin family member Strand = Plus/Plus 15 gi|62087388|dbj|BAD92141.1 protein tyrosine phosphatase, non-receptor type 13 12 1_G8 gi|51317362|ref|NM_0024 AMRASRRR 79 gi|4502919|ref|NP_001288.1| Score = 35.4 bits (76), Expect = 0.056  5 73.3| Homo sapiens FSSNARAPG cyclic nucleotide gated Id = 20/43 (46%), Positives = 23/43 (53%), myosin, heavy chain 9 GXHRPAGG channel beta 1 Gaps = 16/43 (37%) non-muscle (MYH9), GAGGGAGQ gi|45476939|sp|Q8IX07|FOG Query 39 mRNA HGADQRPA 1_HUMAN RPA--EEGQPADRPDQH--R------PEPGAQ----PRPEERE 67 Length = 7474 EEGQPADR Zinc finger protein ZFPM1 RP   EEG PA+ P++H  R      PEPG Q      PEERE Score = 626 bits (316) PDQHRPE (Friend of GATA protein 1) Sbjct 1201 Expect = 4e−177 PGAQPRPE gi|4210366|emb|CAA10317.1| RPEGEEEG-PAE-PEEHSVRICMSPGPEPGEQILSVKMPEERE 1241 Id = 335/342 (97%) ERECSAAA APC2 protein Score = 24.0 bits (49), Expect = 157 Gaps = 1/342 (0%) GTPEQGA gi|5031587|ref|NP_005874.1| Id = 17/49 (35%), Positives = 18/48 (37%), Strand = Plus/Plus adenomatosis polyposis coli 2 Gaps = 27/48 (56%) gi|21360802|gb|AAM49715.1| Query 42 hepatoma-derived growth EEGQPADRPDQH-------------R--PE-PGAQP---------RPE 64 factor-HGDF5 EEG  A  PDQH             R  PE PG+ P         RPE gi|21263499|sp|Q9BZ76|CNT Sbjct 1158 P3_HUMAN Contactin- EEGSAA--PDQHTHPREAATDPPAPRTPPEPPGSPPSSPPPASLCRPE 1203 associated protein-like 3 Score = 32.5 bits (69), Expect = 0.44 precursor (Cell recognition Id = 21/41 (51%), Positives = 22/41 (53%), molecule Caspr3) Gaps = 14/41 (34%) gi|6002605|gb|AAF00055.1 Query 44 transcription factor TBLYM GQPADRPDQHR--PEPGAQPRPEERECSAAAG---TPEQGA 79 gi|7341310|gb|AAF61243.1| GQPA+ PD  R  P PGA  R EE      AG   TPE GA T-cell-specific T-box Sbjct 625 transcription factor T-bet GQPAE-PDAPRSSPGPGA--R-EE-----GAGGAATPEDGA 656 gi|42490771|gb|AAH66122.1| Score = 32.0 bits (68), Expect = 0.59 BCR protein Id = 26/57 (45%), Positives = 28/57 (49%), gi|82546843|ref|NP_004318. 
Gaps = 19/57 (33%) breakpoint cluster region Query 24 isoform 1 GGGAGGGAGQHGADQRFAEEGQPADRPDQHRPEP-GAQPR----PE-E--RECSAAA 72 GGGAGG AG H A  R  EEG          P P G++PR     E E  REC  AA Sbjct 1305 GGGAG-AGLHFAGHRRRREEG----------PAPTGSRPRGAADQELELLRECLGAA 1350 Score = 19.7 bits (39), Expect = 2966 Id = 5/6 (83%), Positives = 6/6 (100%), Gaps = 0/6 (0%) Query 37 DQRPAE 42 D+RPAE Sbjct 1658 DERPAE 1663 Score = 18.9 bits (37), Expect = 5340 Id = 10/16 (62%), Positives = 10/16 (62%), Gaps = 4/16 (25%) Query 13 ARAPGGXHRPAGGGAG 28 AR  GG   P GGGAG Sbjct 390 ARD-GG---PEGGGAG 401 Score = 17.6 bits (34), Expect = 12900 Id = 7/9 (77%), Positives = 7/9 (77%), Gaps = 0/9 (0%) Query 24 GGQAGGGAG 32 GG  GGGAG Sbjct 393 GGPEGGGAG 401
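
By way of illustration only, the Score, Expect, Id, Positives and Gaps values reported for each region of similarity can be extracted from the flattened table text and compared numerically. The following Python sketch is not part of the original disclosure; the names `STATS`, `Hit` and `parse_hits` are hypothetical, and the pattern assumes the summary layout seen in the rows above, including the "Identities"/"Id" and "Positives"/"Pos" spellings and the typographic minus sign used in exponents such as 1e−10.

```python
import re
from dataclasses import dataclass
from typing import List

# One flattened BLAST summary as printed in the table, e.g.
# "Score = 28.6 bits (60), Expect = 1.4 Id = 12/22 (54%), Positives = 14/22 (63%), Gaps = 3/22 (13%)"
STATS = re.compile(
    r"Score\s*=\s*(?P<bits>\d+(?:\.\d+)?)\s*bits\s*\((?P<raw>\d+)\),?\s*"
    r"Expect\s*=\s*(?P<expect>[\d.eE+\-\u2212]+)\s*"
    r"(?:Identities|Id)\s*=\s*(?P<id_n>\d+)/(?P<id_d>\d+)\s*\(\d+%\),?\s*"
    r"(?:Positives|Pos)\s*=\s*(?P<pos_n>\d+)/(?P<pos_d>\d+)\s*\(\d+%\),?\s*"
    r"Gaps\s*=\s*(?P<gap_n>\d+)/(?P<gap_d>\d+)"
)


@dataclass(frozen=True)
class Hit:
    bits: float      # bit score
    expect: float    # E-value
    identical: int   # identical positions
    positives: int   # identical or conservatively substituted positions
    gaps: int        # gap positions
    aligned: int     # alignment length (denominator of the fractions)

    def percent_identity(self) -> int:
        # Integer division truncates, which appears to match the
        # percentages printed in the table (e.g. 12/22 is shown as 54%).
        return 100 * self.identical // self.aligned


def parse_hits(text: str) -> List[Hit]:
    """Return every alignment summary found in a run of table text."""
    hits = []
    for m in STATS.finditer(text):
        hits.append(Hit(
            bits=float(m["bits"]),
            # Normalise the typographic minus (U+2212) before converting.
            expect=float(m["expect"].replace("\u2212", "-")),
            identical=int(m["id_n"]),
            positives=int(m["pos_n"]),
            gaps=int(m["gap_n"]),
            aligned=int(m["id_d"]),
        ))
    return hits


if __name__ == "__main__":
    sample = ("Score = 28.6 bits (60), Expect = 1.4 Id = 12/22 (54%), "
              "Positives = 14/22 (63%), Gaps = 3/22 (13%)")
    for hit in parse_hits(sample):
        print(hit.bits, hit.expect, hit.percent_identity())
```

A summary that deviates from this layout, for example one in which the "=" after "Gaps" or "Expect" is missing, is skipped by the pattern rather than guessed at, so the parsed list can be shorter than the number of alignments shown.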

TABLE 6b Peptide sequences Description of Mimo- of the genes topes, Size that are in in-frame of Description of the Mimotope with T7 pep- sequences that Rank Clone clones 10 B gene tide Mimotopes mimic Region of similarity of peptide  7 2_D4 gi|37790795| WDCATACQ 29 gi|34535600|dbj|BAC87373.1| Score = 37.5 bits (81), gb|AY42221 PGSQRDSVS Unnamed protein product Expect = 0.003 1.1| KKKKKKGG gi|1350799|sp|P49646|YYY1 Id = 14/25 (56%), Homo sapiens XGKN Very very hypothetical Positives = 16/25 (64%), cholesteryl protein RMSA-1 Gaps = 9/25 (36%) ester trans- gi|8928194|sp|Q9UJU2|LEF1 Query 8 fer protein, Lymphoid enhancer binding QPGS         QRDSVSKKKKKK 23 plasma (CETP) factor 1 (LEF-1) +PGS         +RDSVSKKKKKK gene, com- gi|1703273|sp|P50579|AMPM Sbjct 127 plete cds 2_HUMAN EPGSCHSTPAWATERDSVSKKKKKK 151 Lengh = 25612 Initiation factor 2 assoc- Score = 16.8 bits (32), Score = 99.6 iated 67 kDa glycoprotein Expect = 5786 bits (50) (p67) (p67eIF2) Id = 6/11 (54%), Expect = 2e− gi|29337146|sp|Q9NQB0|TF7 Positives = 7/11 (63%), 18 L2_HUMAN Gaps = 0/11 (0%) Id = 50/50 Transcription factor 7- Query 13 (100%) like 2 RDSVSKKKKKK 23 Gaps = 0/50 gi|71153825|sp|Q9BQG0|MB R+ VS K  KK (0%) B1A_HUMAN Sbjct 85 Strand = Myb-binding protein 1A RNPVSTKSTKK 95 Plus/Minus E2F-associated phosphoprotein (EAPP) gi|12644154|sp|P20042|IF2B_(—) HUMAN Eukaryotic translation ini- tiation factor 2 subunit 2 Metastasis adhesion protein (Metadherin) Programmed cell death protein 11 (T cell-specific transcrip- tion factor 1-alpha) (TCF1- alpha) gi|2822161|gb|AAB97937.1| rab3 effector-like 8 2_D8 gi|37790795| WDCATA 29 gi|34535600|dbj|BAC87373.1| Score = 37.5 bits (81), gb|AY42221 CQPGSQ Unnamed protein product Expect = 0.003 1.1| RDSVSK gi|1350799|sp|P49646|YYY1 Id = 14/25 (56%), Homo sapiens KKKKKR Very very hypothetical Positives = 16/25 (64%), cholesteryl G protein RMSA-1 Gaps = 9/25 (36%) ester trans- gi|8928194|sp|Q9UJU2|LEF1 Query 8 fer protein, Lymphoid enhancer binding QPGS         
QRDSVSKKKKKK 23 plasma (CETP) factor 1 (LEF-1) +PGS         +RDSVSKKKKKK gene, com- gi|1703273|sp|P50579|AMPM Sbjct 127 plete cds 2_HUMAN EPGSCHSTPAWATERDSVSKKKKKK 151 Lengh = 25612 Initiation factor 2 assoc- Score = 16.8 bits (32), Score = 99.6 iated 67 kDa glycoprotein Expect = 5786 bits (50) (p67) (p67eIF2) Id = 6/11 (54%), Expect = 2e− gi|29337146|sp|Q9NQB0|TF7 Positives = 7/11 (63%), 18 L2_HUMAN Gaps = 0/11 (0%) Id = 50/50 Transcription factor 7- Query 13 (100%) like 2 RDSVSKKKKKK 23 Gaps = 0/50 gi|71153825|sp|Q9BQG0|MB R+ VS K  KK (0%) B1A_HUMAN Sbjct 85 Strand = Myb-binding protein 1A RNPVSTKSTKK 95 Plus/Minus E2F-associated phosphoprotein (EAPP) gi|12644154|sp|P20042|IF2B_(—) HUMAN Eukaryotic translation ini- tiation factor 2 subunit 2 Metastasis adhesion protein (Metadherin) Programmed cell death protein 11 (T cell-specific transcrip- tion factor 1-alpha) (TCF1- alpha) gi|2822161|gb|AAB97937.1| rab3 effector-like 9 2_G gi|37790795| WDCATA 29 gi|34535600|dbj|BAC87373.1| Score = 37.5 bits (81), 11 gb|AY42221 CQPGSQ Unnamed protein product Expect = 0.003 1.1| RDSVSK gi|1350799|sp|P49646|YYY1 Id = 14/25 (56%), Homo sapiens KKKKKR Very very hypothetical Positives = 16/25 (64%), cholesteryl G protein RMSA-1 Gaps = 9/25 (36%) ester trans- gi|8928194|sp|Q9UJU2|LEF1 Query 8 fer protein, Lymphoid enhancer binding QPGS         QRDSVSKKKKKK 23 plasma (CETP) factor 1 (LEF-1) +PGS         +RDSVSKKKKKK gene, com- gi|1703273|sp|P50579|AMPM Sbjct 127 plete cds 2_HUMAN EPGSCHSTPAWATERDSVSKKKKKK 151 Lengh = 25612 Initiation factor 2 assoc- Score = 16.8 bits (32), Score = 99.6 iated 67 kDa glycoprotein Expect = 5786 bits (50) (p67) (p67eIF2) Id = 6/11 (54%), Expect = 2e− gi|29337146|sp|Q9NQB0|TF7 Positives = 7/11 (63%), 18 L2_HUMAN Gaps = 0/11 (0%) Id = 50/50 Transcription factor 7- Query 13 (100%) like 2 RDSVSKKKKKK 23 Gaps = 0/50 gi|71153825|sp|Q9BQG0|MB R+ VS K  KK (0%) B1A_HUMAN Sbjct 85 Strand = Myb-binding protein 1A RNPVSTKSTKK 95 Plus/Minus E2F-associated 
phosphoprotein (EAPP) gi|12644154|sp|P20042|IF2B_(—) HUMAN Eukaryotic translation ini- tiation factor 2 subunit 2 Metastasis adhesion protein (Metadherin) Programmed cell death protein 11 (T cell-specific transcrip- tion factor 1-alpha) (TCF1- alpha) gi|2822161|gb|AAB97937.1| rab3 effector-like 16 6_G gi|12584762| NSSGV 5 gi|119630243|gb|EAX09838.1 Score = 17.2 bits (33), 12 emb|AL3910 interferon (alpha, beta and Expect = 2312 01.12| omega) receptor 1, isoform Id = 5/5 (100%), Human DNA se- gi|119609949|gb|EAW89543. Positives = 5/5 (100%), quence from 1| lectin, galactoside- Gaps = 0/5 (0%) clone RP11- binding, soluble, 3 binding Query 1 477H21 on protein, isoform CRA_a NSSGV 5 chromosome 1 gi|119593835|gb|EAW73429. NSSGV Contains part 1| cadherin, EGF LAG Sbjct 154 of the PBX1 seven-pass G-type receptor NSSGV 158 gene for pre- 1 (flamingo homolog, B-cell leu- Drosophila), isoformCRA_b kemia trans- gi|56203586|emb|CA119162.1| cription fac- tripartite motif-containing tor 1 67 Length = 196730 Score = 65.9 bits (33) Expect = 2e− 08 Id = 59/71 (83%) Gaps = 0/71 (0%) Strand = Plus/Plus /gene = “PBX1” /product = “pre-B-cell leukemia transcription factor 1” 17 7_C1 gi|18072476| GRGGGGGR 33 gi|56566042|gb|AAV98357.1| Score = 37.1 bits (80), emb|AL3566 GGGGRIRR nucleolar protein family A Expect = 0.004 08.23| Human RKPQEPEK member 1 Id = 16/19 (84%), DNA sequence QRAEVQIQ gi|119615069|gb|EAW94663. 
Positives = 16/19 (84%), from clone G 1| CEBP-induced protein Gaps = 2/19 (10%) RP11-724N1 on Heterogeneous nuclear Query 1 chromosome 10 ribonucleoprotein U-like GRGG-GGGRGGGGGI-GGR 17 Contains the protein 1 GRGG GGGRGGGGG  GCR 5′ end of the (Adenovirus early region Sbjct 172 CNNM2 gene 1B-associated protein 5) GRGGRGGGRGGGGGFRGGR 190 for cyclin M2 (E1B-55 kDa-associated Score = 32.5 bits (69), and a CpG is- protein 5) (E1B-AP5) Expect = 0.11 land, com- Id = 13/19 (68%), plete se- Positives = 14/19 (73%), quence Gaps = 3/19 (15%) Length = Query 1 161053 GRGGGG---GRGGGGGIRR 16 Score = 971 GRGGG    GRGG GG+RR bits (490) Sbjct 8 Expect = 0.0 GRGGGAWGPGRGGAGGLRR 26 Id = 490/490 (100%), Gaps = 0/490 (0%) Strand = Plus/Minus 18 7_C4 gi|77744392| ADHKVRSL 11 gi|55961387|emb|CA117418.1 Score = 24.4 bits (50), gb|DQ23098 RPA cAMP responsive element Expect = 22 8.1| SW1/SNF binding protein-like 1 Id = 7/7 (100%), related, (24.4) Pos = 7/7 (100%), matrix assoc- gi|1705471|sp|P55107|BMP3 Gaps = 0/7 (0%) iated, actin B_HUMAN Query 5 dependent Bone morphogenetic protein VRSLRPA 11 regulator of 3b precursor (BMP-3b) VRSLRPA chromatin, Growth/differentiation Sbjct 151 subfamily b, factor 10 (GDF-10) (23.5) VRSLRPA 157 member 1 gene (Bone inducing protein) Score = 23.5 bits (48), Length = gi|12643326|sp|Q9ULV3|C1Z Expect = 54 50890 1_HUMAN Cip1-interacting Id = 9/15 (60%), Score = 125 zinc finger protein Pos = 11/15 (73%), bits (63) (Nuclear protein NP94) Gaps = 2/15 (13%) Expect = 7e− gi|62087674|dbj|BAD92284.1 Query 3 27 ELK1, member of ETS PNSSADHKVRSLRPA 17 Id = 63/63 oncogene family variant PN+SAD +VR  R A (100%) Ubiquitin specific protease Sbjct 280 Gaps = 0/63 42 PNNSADPRVR--RAA 292 (0%) Chondroitin-4-O- Score = 21.0 bits (42), Strand = sulfotransferase-3 Expect = 233 Plus/Plus Id = 7/11 (63%), Positives = 9/11 (81%), Gaps = 2/11 (18%) Query 2 DHKV--RSLRP 10 DHK+  +SLRP Sbjct 27 DHKIAKQSLRP 37 23 2_D3 gi|37552371| NSSLF 5 gi|119623720|gb|EAX03315.1 Score = 18.5 bits (36), 
ref|NT_0112 DEAH (Asp-Glu-Ala-His) Expect = 957 55.14| box polypeptide 16, isoform Id = 5/5 (100%), Hs19_11412 CRA_d Positives = 5/5 (100%), Homo sapiens gi|119618562|gb|EAW98156. Gaps = 0/5 (0%) chromosome 19 1| citron (rho-interacting, Query 1 genomic con- serine/threonine kinase NSSLF 5 tig, refer- 21), isoform CRA_b NSSLF ence assembly gi|5833114|gb|AAD53401.1|A Sbjct 265 Length = F107840_1 nuclear pore- NSSLF 269 7286004 associated protein Features in gi|119581856|gb|EAW61452. this part of 1| T-cell leukemia homeobox subject se- 3 quence: adenomatosis polyposis coli 2 Score = 793 bits (400) Expect = 0.0 Id = 411/416 (98%) Gaps = 0/416 (0%) Strand = Plus/Minus 24 3_D4 gi|555853|gb| PRQSFTLVA 9 gi|21749258|dbj|BAC03563.1 Score = 25.2 bits (52), U13369.1|H Unnamed protein product Expect = 11 SU13369 Human gi|14043145|gb|AAH07560.1| Id = 7/7 (100%), ribosomal DNA LASP1 protein Pos = 7/7 (100%), complete re- LIM and SH3 domain Gaps = 0/7 (0%) peating unit protein 1 (LASP-1) (MLN Query 2 Length = 50) RQSFTLV 8 42999 gi|46937163|emb|CAE45323. 
RQSFTLV Score = 850 1|LIM-nebulette alpha-1,3- Sbjct 25 bits (429) glucosyltransferase ALG8 RQSFTLV 31 Expect = 0.0 isoform b Score = 25.2 bits (52), Id = 431/432 Carbohydrate Expect = 17 (99%) sulfotransferase 6 Id = 7/9 (77%), Gaps = 0/432 Sortilin-related receptor Pos = 9/9 (100%), (0%) precursor Gaps = 0/9 (0%) Strand = Novel protein similar to Query 7 Plus/Plus mouse meiosis defective 1 PRQSFTLVA 15 gene Erythroid P+QSFT+VA differentiation-related Sbjct 58 factor 1 (EDRF1) PKQSFTMVA 66 30 9_G gi|17907166| RVGRKVQ 7 gi|8134304|sp|Q92667|AKAP Score = 22.3 bits (45), 8 emb|AL1333 1_HUMAN A kinase anchor Expect = 131 51.34| protein 1 (Protein kinase A Id = 7/8 (87%), Human DNA se- anchoring protein 1) (22.3) Positives = 7/8 (87%), quence from (Spermatid A-kinase anchor Gaps = 0/8 (0%) clone RP1- protein 84)(S-AKAP84) Query 6 90J20 on gi|22261|sp|P13765|2DOB_(—) SRVGRKVQ 13 chromosome HUMAN HLA class II SRV RKVQ 6p24.1-25.3 histocompatibility antigen, Sbjct 172 Contains the DO beta chain precursor SRVPRKVQ 179 5′ end of a gi|57209707|emb|CA141975.1 Score = 21.4 bits (43), putative Protein phosphatase 2 Expect = 235 novel gene, regulatory subunit B, beta Id = 6/6 (100%), the SERPINB9 Dual specificity A-kinase Positives = 6/6 (100%), gene for ser- anchoring protein 1 Gaps = 0/6 (0%) ine (or cy- Solute carrier family 2 Query 8 steine) VGRKVQ 13 proteinase VGRKVQ inhibitor Sbjct 117 clade B VGRKVQ 122 (ovalbumin) Score = 18.5 bits (36), member 9, a Expect = 954 novel gene, Id = 5/5 (100%), the SERPINB6 Positives = 5/5 (100%), gene for Gaps = 0/5 (0%) serine (or Query 3 cysteine) GRKVQ 7 proteinase GRKVQ inhibitor Sbjct 335 clade B GRKVQ 339 (ovalbumin) member 6, three puta- tive novel genes, the gene for NAD(P)H dehydrogenase quinone 2 (NQO2) and eight pre- dicted CpG islands, com- plete se- quence Length = 173670 Score = 559 bits (282) Expect = 3e− 156 Id = 314/320 (98%) Gaps = 4/320 (1%) Strand = Plus/Plus 33 2_G gi|14336700| LQPGRQSET 36 gi|34528883|dbj|BAC85594.1| 
Score = 55.4 bits (123), 4 gb|AE00646 PSQKKKKK Phsophorylase kinase Expect = 2e−07 4.1| Homo NLGGMGTG alpha/beta Id = 17/17 (100%), sapiens AHPFDPSTL gi|33415049|gb|AAQ18032.1| Positives = 17/17 (100%), 16p13.3 se- GS Transformation-related Gaps = 0/17 (0%) quence sec- protein 10 Query 1 tion 3 of 8 gi|34528883|dbj|BAC85594.1| LQPGRQSETPSQKKKKK 17 Length = unnamed protein product LQPGRQSETPSQKKKKK 256073 Phsophorylase kinase Sbjct 19 Features in alpha/beta LQPGRQSETPSQKKKKK 35 this part of Putative nucleolar Score = 44.3 bits (97), subject trafficking phosphoprotein Expect = 4e−04 sequence: SLC2A11 protein variant Id = 15/17 (88%), RAR (RAS like CLL associated antigen Positives = 15/17 (88%), GTPase) like KW-2 Gaps = 0/17 (0%) Score = 103 Ras-related protein H-Ras Query 1 bits (52) BCL2-associated LQPGRQSETPSQKKKKK 17 Expect = 8e− athanogene isoform 1L LQPG QSETPSQRK KK 20 (BAG-1) Sbjct 37 Id = 73/81 Lung cancer metastasis- LQPGLQSETPSQKKTKK 53 (90%) related protein (LCMRP1) Gaps = 0/81 (0%) Strand = Plus/Plus 36 5_G gi|119380763| HRGSPSNVG 24 gi|4837879|gb|AAD30731.1| Score = 25.7 bits (53), 3 gb|EF17744 AFRIGRESV immunoglobulin heavy Expect = 12 7.1|Homo KESLFY chain variable region Id = 8/13 (61%), sapiens iso- gi|30060232|gb|AAP13073.1| Positives = 10/13 (76%), late TA23 E3 ligase for inhibin Gaps = 0/13 (0%) mitochon- receptor Query 11 drion, com- gi|85682779|sp|Q9ULT8|HEC FRIGRESVKESLF 23 plete genome D1_HUMAN P I R++VK SLF Length = E3 ubiquitin-protein ligase Sbjct 64 16569 HECTD1 FTISRDNVKNSLF 76 Score = 105 (HECT domain-containing Score = 25.2 bits (52), bits (53) protein 1) Expect = 17 Expect = 1e− (E3 ligase for inhibin Id = 11/21 (52%), 20 receptor) (EULIR) Positives = 12/21 (57%), Id = 53/53 Gaps = 7/21 (33%) (100%) Query 9 Gaps = 0/53 GAFRIGR---ESVK----ESL 22 (0%) G FR+GR   E VK    ESL Strand = Sbjct 2123 Plus/Minus GEFRVGRLKHERVKVPRGESL 2143 Score = 15.1 bits (28), Expect = 19195 Id = 4/4 (100%), Positives = 4/4 (100%), Gaps = 0/4 (0%) Query 2 RGSP 5 
RGSP Sbjct 288 RGSP 291 Score = 25.2 bits (52), Expect = 17 Id = 11/21 (52%), Positives = 12/21 (57%), Gaps = 7/21 (33%) Query 9 GAFRIGR---ESVK----ESL 22 G FR+GR   E VK    ESL Sbjct 2123 GEFRVGRLKHERVKVPRGESL 2143 43 10_D gi|119380763| NSSVGC 6 gi|55960173|emb|CA114537.1| Score = 17.2 bits (33), 6 gb|EF17744 HIV type I enhancer Expect = 2774 7.1|Homo binding protein 3 Id = 5/5 (100%), sapiens iso- gi|119617590|gb|EAW97184.1| Positives = 5/5 (100%), late TA23 Mdm4, transformed 3T3 Gaps = 0/5 (0%) mitochon- cell double minute 1, p53 Query 1 drion, com- binding protein (mouse), NSSVG 5 plete genome isoform CRA_b NSSVG Length = gi|119604594|gb|EAW84188.1| Sbjct 30 16569 IVRAB3D, member RAS NSSVG 34 Score = 446 oncogene family, isoform bits (225), CRA_a Expect = 4e− 123 Id = 261/261 (100%), Gaps = 0/261 (0%) Strand = Plus/Minus 56 1_G gi|37790795| WDCATACQ 56 gi|34535600|dbj|BAC87373.1| Score = 37.5 bits (81), 4 gb|AY42221 PGSQRDSVS Unnamed protein product Expect = 0.019 1.1| KKKKKKGG gi|1350799|sp|P49646|YYY1 Id = 14/25 (56%), Homo sapiens XEKXXGXG HUMAN Positives = 16/25 (64%), cholesteryl XFFXXXKX Very very hypothetical Gaps = 9/25 (36%) ester trans- XXXFXRXF protein RMSA-1 Query 8 fer protein, XPXFXXK gi|537530|gb|AAB68050.1| QPGS---------QRDSVSKKKKKK 23 plasma (CETP) chromnosomal protein +PGS---------+RDSVSKKKKKK gene, com- Lymphoid enhancer binding Sbjct 127 plete cds factor 1 (LEF-1) EPGSCHSTPAWATERDSVSKKKKKK 151 Length = (T cell-specific transcrip- Score = 27.4 bits (57), 25612 tion factor 1-alpha) (TCF1- Expect = 22 Score = 99.6 alpha) Id = 8/10 (80%), bits (50) Methionine aminopeptidase Positives = 10/10 (100%), Expect = 1e− 2 (MetAP 2) (Initiation Gaps = 0/10 (0%) 18 factor 2 associated 67 kDa Query 14 Id = 50/50 glycoprotein) DSVSKKKKKK 23 (100%) (p67) (p67eIF2) ++VSKKKKKK Gaps = 0/50 Sbjct 265 (0%) ETVSKKKKKK 274 Strand = Plus/Minus 72 10_(—) gi|14329907| ARQVF 5 gi|57162452|emb|CA140479.1 Score = 19.3 bits (38), H11 emb|AL1624 BRCA2 (Breast cancer 2, 
Expect = 378 31.17| Human early onset) (19.3) Id = 5/5 (100%), DNA sequence gi|4502451|ref|NP_000050.1| Positives = 5/5 (100%), fr clone Breast cancer 2, early on- Gaps = 0/5 (0%) RP11-46A10 on set Query 1 chromosome gi|13540359|gb|AAK9432.1| ARQVF 5 1q25.2-31.1 Mutant early onset breast ARQVF Contains 3′ cancer susceptibility pro- Sbjct 329 end of XPR1 tein ARQVF 333 gene for xen- gi|21361458|ref|NP_055601.2 otropic and Rho guanine nucleotide polytropic exchange factor (GEF) 17 retrovirus gi|15987489|gb|AAL11991.1| receptor Tumor endothelial marker 4 Length = gi|51477779|ref|XP_067076.3 139006 PREDICTED: similar to Score = 170 testis specific protein, Y- bits(86) linked 2 Expect = 7e− Crn (crooked neck-like 1) 40 Inhibitor of Bruton's Id = 117/122 tyrosine kinase (16.8) (95%) Gaps = 0/122 (0%) Strand = Plus/Minus 74 2_C1 gi|37790795| WDCATA 29 gi|34535600|dbj|BAC87373.1| Score = 37.5 bits (81), 2 gb|AY42221 CQPGSQ Unnamed protein product Expect = 0.003 1.1|Homo RDSVSK gi|1350799|sp|P49646|YYY1 Id = 14/25 (56%), sapiens KKKKKR Very very hypothetical Positives = 16/25 (64%), cholesteryl G protein RMSA-1 Gaps = 9/25 (36%) ester trans- gi|8928194|sp|Q9UJU2|LEF1 Query 8 fer protein, Lymphoid enhancer binding QPGS---------QRDSVSKKKKKK 23 plasma (CETP) factor 1 (LEF-1) +PGS         +RDSVSKKKKKK gene, com- gi|1703273|sp|P50579|AMPM Sbjct 127 plete cds 2_HUMAN EPGSCHSTPAWATERDSVSKKKKKK 151 Length = Initiation factor 2 assoc- Score = 16.8 bits (32), 25612 iated 67 kDa glycoprotein Expect = 5786 Score = 99.6 (p67) (p67eIF2) Id = 6/11 (54%), bits (50) gi|29337146|sp|Q9NQB0|TF7 Positives = 7/11 (63%), Expect = 2e− L2_HUMAN Gaps = 0/11 (0%) 18 Transcription factor 7- Query 13 Id = 50/50 like 2 RDSVSKKKKKK 23 (100%) gi|71153825|sp|Q9BQG0|MB R+ VS K  KK Gaps = 0/50 B1A_HUMAN Sbjct 85 (0%) Myb-binding protein 1A RNPVSTKSTKK 95 Strand = E2F-associated Plus/Minus phosphoprotein (EAPP) gi|12644154|sp|P20042|IF2B_(—) HUMAN Eukaryotic translation initiation factor 2 subunit 2 
Metastasis adhesion protein (Metadherin) Programmed cell death protein 11 (T cell-specific transcrip- tion factor 1-alpha) (TCF1- alpha) Methionine aminopeptidase 2 (MetAP 2) (Peptidase M 2) (Initiation factor 2 associated 67 kDa glycoprotein) (p67) (p67eIF2) gi|2822161|gb|AAB97937.1| rab3 effector-like 91 2_F3 gi|37790795| WDCATA 29 gi|34535600|dbj|BAC87373.1| Score = 37.5 bits (81), 2 gb|AY42221 CQPGSQ Unnamed protein product Expect = 0.003 1.1|Homo RDSVSK gi|1350799|sp|P49646|YYY1 Id = 14/25 (56%), sapiens KKKKKR Very very hypothetical Positives = 16/25 (64%), cholesteryl G protein RMSA-1 Gaps = 9/25 (36%) ester trans- gi|8928194|sp|Q9UJU2|LEF1 Query 8 fer protein, Lymphoid enhancer binding QPGS---------QRDSVSKKKKKK 23 plasma (CETP) factor 1 (LEF-1) +PGS         +RDSVSKKKKKK gene, com- gi|1703273|sp|P50579|AMPM Sbjct 127 plete cds 2_HUMAN EPGSCHSTPAWATERDSVSKKKKKK 151 Length = Initiation factor 2 assoc- Score = 16.8 bits (32), 25612 iated 67 kDa glycoprotein Expect = 5786 Score = 99.6 (p67) (p67eIF2) Id = 6/11 (54%), bits (50) gi|29337146|sp|Q9NQB0|TF7 Positives = 7/11 (63%), Expect = 2e− L2_HUMAN Gaps = 0/11 (0%) 18 Transcription factor 7- Query 13 Id = 50/50 like 2 RDSVSKKKKKK 23 (100%) gi|71153825|sp|Q9BQG0|MB R+ VS K  KK Gaps = 0/50 B1A_HUMAN Sbjct 85 (0%) Myb-binding protein 1A RNPVSTKSTKK 95 Strand = E2F-associated Plus/Minus phosphoprotein (EAPP) gi|12644154|sp|P20042|IF2B_(—) HUMAN Eukaryotic translation initiation factor 2 subunit 2 Metastasis adhesion protein (Metadherin) Programmed cell death protein 11 (T cell-specific transcrip- tion factor 1-alpha) (TCF1- alpha) Methionine aminopeptidase 2 (MetAP 2) (Peptidase M 2) gi|2822161|gb|AAB97937.1| rab3 effector-like 92 2_H gi|37790795| WDCATA 29 gi|34535600|dbj|BAC87373.1| Score = 37.5 bits (81), 2 gb|AY42221 CQPGSQ Unnamed protein product Expect = 0.003 1.1|Homo RDSVSK gi|1350799|sp|P49646|YYY1 Id = 14/25 (56%), sapiens KKKKKR Very very hypothetical Positives = 16/25 (64%), cholesteryl G protein RMSA-1 
Gaps = 9/25 (36%) ester trans- gi|8928194|sp|Q9UJU2|LEF1 Query 8 fer protein, Lymphoid enhancer binding QPGS---------QRDSVSKKKKKK 23 plasma (CETP) factor 1 (LEF-1) +PGS         +RDSVSKKKKKK gene, com- gi|1703273|sp|P50579|AMPM Sbjct 127 plete cds 2_HUMAN EPGSCHSTPAWATERDSVSKKKKKK 151 Length = Initiation factor 2 assoc- Score = 16.8 bits (32), 25612 iated 67 kDa glycoprotein Expect = 5786 Score = 99.6 (p67) (p67eIF2) Id = 6/11 (54%), bits (50) gi|29337146|sp|Q9NQB0|TF7 Positives = 7/11 (63%), Expect = 2e− L2_HUMAN Gaps = 0/11 (0%) 18 Transcription factor 7- Query 13 Id = 50/50 like 2 RDSVSKKKKKK 23 (100%) gi|71153825|sp|Q9BQG0|MB R+ VS K  KK Gaps = 0/50 B1A_HUMAN Sbjct 85 (0%) Myb-binding protein 1A RNPVSTKSTKK 95 Strand = E2F-associated Plus/Minus phosphoprotein (EAPP) gi|12644154|sp|P20042|IF2B_(—) HUMAN Eukaryotic translation initiation factor 2 subunit 2 Metastasis adhesion protein (Metadherin) Programmed cell death protein 11 (T cell-specific transcrip- tion factor 1-alpha) (TCF1- alpha) Methionine aminopeptidase 2 (MetAP 2) (Peptidase M 2) gi|2822161|gb|AAB97937.1| rab3 effector-like 10 5_D4 gi|16306514| YKQRRRVK 30 gi|16741704|gb|AAH16648.1| Score = 28.2 bits (59),  4 gb|AC00996 REHPNRKQ Fos-related antigen 1 (FOS- Expect = 2.0 7.8|Homo LKDILLASH like antigen 1) (28.2) Id = 13/31 (41%), sapiens BAC GGPSP gi|29792148|gb|AAH50283.1| Pos = 16/31 (51%), clone RP11- WASF3 protein (27.8) Gaps = 11/31 (35%) 401O19 gi|59798973|sp|Q9H0K1|SN1 Query3 from 2 L2_HUMAN QRRRVKREHPN----------RKQLKDILLA 23 Length = Serine/threonine protein +RRRV+RE  N          RK+L D L A 176206 kinase SNF1-like kinase 2 Sbjct106 Score = 694 gi|7380440|sp|P46100|ATRX ERRRVRRER-NKLAAAKCRNRRKELTDFLQA 135 bits (350) _HUMAN Score = 27.8 bits (58), Expect = 0.0 Transcriptional regulator Expect = 2.0 Id = 350/350 ATRX (X-linked helicase II) Id = 10/19 (52%), (100%) (X-linked nuclear protein) Pos = 13/19 (68%), Gaps = 0/350 Fonconi anemia group A Gaps = 4/19 (21%) (0%) Protein (25.7) Query 2 Strand = 
E2F dimerization partner 2 KQRRRVKRE----HPNRKQ 16 Plus/Minus Dehydrogenase/reductase K++RR KRE    +PNR Q SDR family member 7 Sbjct 174 Retinal short chain KEKRRQKREKHKLNPNRNQ 192 dehydrogenase/reductase 4 Score = 26.9 bits (56), Expect = 4.9 Id = 9/14 (64%), Pos = 11/14 (78%), Gaps = 2/14 (14%) Query 17 LKDILLASHGGPSP 30 LKDI+LA+   PSP Sbjct 535 LKDIMLANQ--PSP 546  7 2_D4 gi|37790795| WDCATACQ 29 gi|34535600|dbj|BAC87373.1| Score = 37.5 bits (81), gb|AY42221 PGSQRDSVS Unnamed protein product Expect = 0.003 1.1|Homo KKKKKKGG gi|1350799|sp|P49646|YYY1 Id = 14/25 (56%), sapiens XGKN Very very hypothetical Positives = 16/25 (64%), cholesteryl protein RMSA-1 Gaps = 9/25 (36%) ester trans- gi|8928194|sp|Q9UJU2|LEF1 Query 8 fer protein, Lymphoid enhancer binding QPGS---------QRDSVSKKKKKK 23 plasma (CETP) factor 1 (LEF-1) +PGS         +RDSVSKKKKKK gene, com- gi|1703273|sp|P50579|AMPM Sbjct 127 plete cds 2_HUMAN EPGSCHSTPAWATERDSVSKKKKKK 151 Length = Initiation factor 2 assoc- Score = 16.8 bits (32), 25612 iated 67 kDa glycoprotein Expect = 5786 Score = 99.6 (p67) (p67eIF2) Id = 6/11 (54%), bits (50) gi|29337146|sp|Q9NQB0|TF7 Positives = 7/11 (63%), Expect = 2e− L2_HUMAN Gaps = 0/11 (0%) 18 Transcription factor 7- Query 13 Id = 50/50 like 2 RDSVSKKKKKK 23 (100%) gi|71153825|sp|Q9BQG0|MB R+ VS K  KK Gaps = 0/50 B1A_HUMAN Sbjct 85 (0%) Myb-binding protein 1A RNPVSTKSTKK 95 Strand = E2F-associated Plus/Minus phosphoprotein (EAPP) gi|12644154|sp|P20042|IF2B_(—) HUMAN (Metadherin) Programmed cell death protein 11 (T cell-specific transcrip- tion factor 1-alpha) (TCF1- alpha) Methionine aminopeptidase 2 (MetAP 2) (Peptidase M 2) (Iniation factor 2 associated 67 kDa glycoprotein) (p67)(p67eIF2) gi|2822161|gb|AAB97937.1| rab3 effector-like  8 2_D8 gi|37790795| WDCATA 29 gi|34535600|dbj|BAC87373.1| Score = 37.5 bits (81), gb|AY42221 CQPGSQ Unnamed protein product Expect = 0.003 1.1|Homo RDSVSK gi|1350799|sp|P49646|YYY1 Id = 14/25 (56%), sapiens KKKKKR Very very hypothetical 
Positives = 16/25 (64%), cholesteryl G protein RMSA-1 Gaps = 9/25 (36%) ester trans- gi|8928194|sp|Q9UJU2|LEF1 Query 8 fer protein, Lymphoid enhancer binding QPGS---------QRDSVSKKKKKK 23 plasma (CETP) factor 1 (LEF-1) +PGS         +RDSVSKKKKKK gene, com- gi|1703273|sp|P50579|AMPM Sbjct 127 plete cds 2_HUMAN EPGSCHSTPAWATERDSVSKKKKKK 151 Length = Initiation factor 2 assoc- Score = 16.8 bits (32), 25612 iated 67 kDa glycoprotein Expect = 5786 Score = 99.6 (p67) (p67eIF2) Id = 6/11 (54%), bits (50) gi|29337146|sp|Q9NQB0|TF7 Positives = 7/11 (63%), Expect = 2e− L2_HUMAN Gaps = 0/11 (0%) 18 Transcription factor 7- Query 13 Id = 50/50 like 2 RDSVSKKKKKK 23 (100%) gi|71153825|sp|Q9BQG0|MB R+ VS K  KK Gaps = 0/50 B1A_HUMAN Sbjct 85 (0%) Myb-binding protein 1A RNPVSTKSTKK 95 Strand = E2F-associated Plus/Minus phosphoprotein (EAPP) gi|12644154|sp|P20042|IF2B_HUMAN Eukaryotic translation initiation factor 2 subunit 2 Metastasis adhesion protein (Metadherin) Programmed cell death protein 11 (T cell-specific transcription factor 1-alpha) (TCF1-alpha) Methionine aminopeptidase 2 (MetAP 2) (Peptidase M 2) (Initiation factor 2 associated 67 kDa glycoprotein) (p67) (p67eIF2) gi|2822161|gb|AAB97937.1| rab3 effector-like
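The "Region of similarity" entries in the table above report standard BLAST alignment statistics (bit score, raw score, Expect value, and identity/positive/gap fractions). As a reading aid only, the following minimal Python sketch shows how one such entry could be parsed and its percentages re-derived. The function name `parse_hit` and the result keys are hypothetical and not part of the described method; the round-down rule for percentages is an assumption inferred from the tabulated values (for example, 6/11 is listed as 54%).

```python
import re

# Matches "Score = 37.5 bits (81), Expect = 0.003" (comma optional).
_SCORE = re.compile(
    r"Score\s*=\s*(?P<bits>[\d.]+)\s*bits\s*\((?P<raw>\d+)\),?\s*"
    r"Expect\s*=\s*(?P<expect>[\d.eE+-]+)"
)

# Matches "Id = 14/25", "Identities = 14/25", "Pos = 16/25", "Gaps = 9/25", etc.
_FRACTION = re.compile(
    r"(?P<name>Identities|Id|Positives|Pos|Gaps)\s*=\s*(?P<num>\d+)/(?P<den>\d+)"
)

# The tables abbreviate field names inconsistently; map them to one key each.
_ALIASES = {
    "Identities": "identities", "Id": "identities",
    "Positives": "positives", "Pos": "positives",
    "Gaps": "gaps",
}


def parse_hit(text):
    """Parse one alignment-statistics entry into a dict (hypothetical helper)."""
    # The tables print exponents with a typographic minus sign (U+2212).
    text = text.replace("\u2212", "-")
    m = _SCORE.search(text)
    if m is None:
        raise ValueError("no 'Score = ... Expect = ...' statistics found")
    hit = {
        "bits": float(m["bits"]),
        "raw": int(m["raw"]),
        "expect": float(m["expect"]),
    }
    for f in _FRACTION.finditer(text):
        num, den = int(f["num"]), int(f["den"])
        # Assumption: percentages are rounded down, as the table values suggest.
        hit[_ALIASES[f["name"]]] = (num, den, 100 * num // den)
    return hit


entry = ("Score = 37.5 bits (81), Expect = 0.003 Id = 14/25 (56%), "
         "Positives = 16/25 (64%), Gaps = 9/25 (36%)")
print(parse_hit(entry))
```

Each fraction is returned as a (matches, alignment length, percent) tuple, so a recomputed percentage can be compared directly against the value printed in the table.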

TABLE 6b Peptide sequences Description of Mimo- of the genes topes, Size that are in in-frame of Description of the Mimotope with T7 pep- sequences that Rank Clone clones 10 B gene tide Mimotopes mimic Region of similarity of peptide  3 10_C gi|19807899| EMKRHIST 37 gi|585487|sp| Score = 27.8 bits (58), Expect = 5.1 3 gb|AC11076 LRWKTCLN Q07325|SCYB9_HUMAN Id = 14/28 (50%), Pos = 17/28 (60%) 9.2|Homo ANMKELLE Small inducible Gaps = 11/28 (39%) sapiens BAC IKVTGKIRY cytokine B9 precur- Query 12 clone RP11- NQGL sor (CXCL9) ISTLRWK----TCLN---ANMKELLEIK 32 141B14 from Gamma interferon I+TL  K    TCLN   A++ KEL IK 2, complete induced monokine Sbjct 64 sequence (MIG) (27.8) IATL--KNGVQTCLNPDSADVKEL--IK 87 Length = gi|62898822|dbj| Score = 26.1 bits (54), Expect = 9.0 134317 BAD97265.1 Id = 9/13 (69%), Pos = 12/13 (92%), Score = 599 Serologically de- Gaps = 1/13 (7%) bits (302) fined colon cancer Query 19 Expect = 2e− antigen 33 variant MKELLEIKVTGKI 31 168 Antigen NY-CO-33 M+EL+E KVTGK+ Id = 312/317 (26.1) Sbjct 270 (98%) gi|31077164|sp| MEELVE-KVTGKV 281 Gaps = 0/317 O95757|HS74 L_HUMAN Score = 24.0 bits (49), Expect = 39 (0%) Heat shock 70 kDa Id = 6/6 (100%), Pos = 6/6 (100%), Strand = protein 4L Gaps = 0/6 (0%) Plus/Plus gi|21315086|gb| Query 8 AAH30792.1| TLRWKT 13 Cyclin-dependent TLRWKT kinase 5, regula- Sbjct 403 tory subunit 1 TLRWKT 408 Transient receptor potential cation channel subfamily V member 3 Vanilloid receptor-like 3 (VRL-3) Dehydrodolichyl diphosphate synthase Tyrosyl-tRNA synthetase Putative mitochon- drial outer mem- brane protein import receptor  5 10_G gi|21686938| CINMDSPPK 11 gi|57208724|emb| Score = 24.8 bits (51), Expect = 338 8 gb|AC11605 QC CAI42568.1 Id = 6/7 (85%), Pos = 7/7 (100%), 0.3|Homo GNAS complex locus Gaps = 0/7 (0%) sapiens BAC (OTTHUMP- Query 2 clone RP11- 00000031737) INMDSPP 8 427F2 from 2, Guanine nucleotide +NMDSPP complete se- binding protein, Sbjct 195 quence alpha stimulating VNMDSPP 201 Length = activity polypep- Score 
= 22.3 bits (45), Expect = 131 165225 tide 1 (24.8) Id = 7/9 (77%), Pos = 7/9 (77%), Score = 343 gi|68565390|sp| Gaps = 2/9 (22%) bits (173) Q14676|MDC1_HUMAN Query 4 Expect = 3e− Mediator of DNA da- MDSPP--KQ 10 92 mage checkpoint MDSPP  KQ Id = 179/182 protein 1 (Nuclear Sbjct 1818 (98%) factor with BRCT MDSPPHQKQ 1826 Gaps = 0/182 domains 1) Score = 22.3 bits (45), Expect = 131 (0%) gi|50400452|sp| Id = 8/12 (66%), Pos = 8/12 (66%), Strand = Q7Z569|BRAP Gaps = 2/12 (16%) Plus/Plus HUMAN Query 1 BRCA1-associated CINM--DSPPKQ 10 protein (Impedes CIN   DSP KQ mitogenic signal Sbjct 110 propagation)(IMP) CINAAPDSPSKQ 121 gi|739072|prf| 12002263A E1A-assoc protein gp130 Retinoblastoma-like protein 2 gi|55957867|emb| CA113220.1 E74-like factor 1 (ets domain trans- cription factor) RUN and SH3 domain containing protein 2 Ezrin-radixin- moesin binding phosphoprotein 50 Impedes mitogenic signal propagation (IMP) Membrane-associated nucleic acid bind- ing protein E1A- associated protein gp130 13 6_C8 gi|21263318| GAGWEWV 7 gi|18089035|gb| Score = 23.1 bits (47), Expect = 38 gb|AC10444 AAH20586.1| Id = 5/5 (100%), Pos = 5/5 (100%), 1.2|Homo SERS14 protein Gaps = 0/5 (0%) sapiens (23.1) Query 3 chromosome 3 gi|49256613|gb| GWEWV 7 clone RP11- AAH73912.1| GWEWV 901H12, com- ACCN4 protein Sbjct 946 plete se- (22.7) GWEWV 950 quence Regenerating islet Score = 22.7 bits (46), Expect = 50 Length = derived protein 3 Id = 5/5 (100%), Positives = 5/5 (100%), 177320 alpha precursor Gaps = 0/5 (0%) Score = 262 Pancreatitis assoc- Query 2 bits (132) iated protein 1 AGWEW 6 Expect = 4e− (PAP) (21.8) AGWEW 67 ELAM-1 ligand Sbjct 255 Id = 160/171 fucosyltransferase AGWEW 259 (93%) Mitochondrial ribo- Gaps = 1/171 somal protein (0%) bMRP64 Strand = Plus/Minus 14 6_D4 gi|15004913| PLCLASLLS 57 gi|57209194|emb| Score = 32.9 bits (70), Expect = 0.19 gb|AC00947 FIVCLFHFR CA141407.1 Id = 8/9 (88%), Positives = 9/9 (100%), 5.5|Homo YLPTILLPP Dedicator of cyto- Gaps = 0/9 (0%) sapiens BAC 
ILKHKCNDR kinesis 11(32.9) Query 12 clone RP11- MHLTCFGS DOCK11 protein VCLFHFRYL 20 285F23 from AKALMYSL Cdc42-associated VCLFHFRY+ 2, complete SNNRC guanine nucleotide Sbjct 1319 sequence exchange factor VCLFHFRYM 1327 Score = 835 ACG|DOCK11 (32.9) Score = 30.3 bits (64), Expect = 1.7 bits(421) gi|13634012|sp| Id = 14/44 (31%), Pos = 20/44 (45%), Expect = 0.0 Q15884|CI61_HUMAN Gaps = 21/44 (47%) Id = 424/425 Protein C9orf61 6 (99%) (Protein X123) SPLCLA-------SLLSFIVCLFHFR--------------YLPT Gaps = 0/425 (30.3) 28 (0%) Probable G-protein S +C+A       S+LSF+VC F +R              +LPT Strand = coupled receptor 37 Plus/Minus 113 precursor SRMCMAISICQMLSMLSFVVCAFRYRHMFKRGWPMGTCCLFLPT (G-protein coupled 80 receptor PGR23) (29.5) Seven transmemhrane helix receptor (26.9) Taste receptor type 2 member 7 (T2R7) (26.5) Wingless-type MMTV integration site family (26.5) 20 10_B gi|21747795| NSFHN 5 gi|119626686|gb| Score = 20.2 bits (40), Expect = 295 11 gb|AC12486 EAX06281.1| Id = 5/5 (100%), Positives = 5/5 (100%), 4.3| Homo hCG21296, isoform Gaps = 0/5 (0%) sapiens BAC CRA_c Query 1 clone RP11- gi|119615394|gb| NSFHN 5 570J4 from 4, EAW94988.1 NSFHN complete se- ubiquitin specific Sbjct 310 quence peptidase 48, iso- NSFRN 314 Length = form CRA_b 166797 gi|119584494|gb| Score = 224 EAW64090.1 bits (113) solute carrier Expect = 4e− family 6 56 (neurotransmitter Id = 113/113 transporter, GABA), (100%) member 1, isoform Gaps = 0/113 CRA_a (0%) Strand = Plus/Plus 25 9_C4 gi|22657585| GITGSRPA 12 gi|119623989|gb| Score = 29.9 bits (63), Expect = 0.68 gb|AC09156 WPTW EAX03584.1| Id = 7/7 (100%), Positives = 7/7 (100%), 4.12|Homo cAMP responsive Gaps = 0/7 (0%) sapiens element binding Query 6 chromosome protein-like 1, RPAWPTW 12 11, clone isoform CRA_d RPAWPTW RP11-732A19, gi|14250004|gb| Sbjct 312 complete AAH08394.1| RPAWPTW 318 sequence CREBL1 protein Length = gi|119629047|gb| 211735 EAX08642.1| Score = 317 hCG1816309 bits (160) gi|119625915|gb| Expect = 4e− EAX05510.1| 84 
homeodomain-only Id = 193/204 protein, isoform (94%) CRA_i Gaps = 3/204 gi|24286115|gb| (1%) AAN46678.1| Strand = hypothetical Plus/Plus protein HGRHSV1 gi|21751592|dbj| BAC03997.1| unnamed protein product 28 7_C3 gi|11493153| VRLVRTEE 23 gi|55859712|emb| Score = 29.1 bits (61), Expect = 0.89 emb|AL1185 RLELRTRS CA110983.1 Id = 7/7 (100%), Positives = 7/7 (100%), 23.18| WNWGLVQ Glucosidase beta 2 Gaps = 0/7 (0%) HSJ1031J8 (29.1) Query 15 Human DNA gi|55663314|emb| RSWNWGL 21 sequence from CAH72550.1|Asp RSWNWGL clone RP5- (abnormal spindle)- Sbjct 215 1031J8 like, microcephaly RSWNWGL 221 on chromosome associate Score = 28.6 bits (60), Expect = 1.2 20 gi|55663911|emb| Id = 10/18 (55%), Pos = 14/18 (77%), Contains a CAH70187.1|Ino- Gaps = 3/18 (16%) putative sitol 1,4,5-tris- Query 1 novel gene, phosphate 3-kinase VRLVRTEERLELRTRSWN 18 complete B VRLVRT   +EL T++W+ sequence gi|97536946|sp| Sbjct 226 Length = Q9C0D0|PHAR1_HUMAN VRLVRT---MELLTQNWD 240 155213 Phosphatase and Score = 26.9 bits (56), Expect = 3.9 Score = 1304 actin regulator 1 Id = 10/18 (55%), Pos = 12/18 (66%), bits (658) (26.1) Gaps = 6/18 (33%) Expect = 0.0 gi|37779178|gb| Query 3 Id = 707/724 AAO73817.1| LV---RTEERLELRTRSW 17 (97%) HBeAg-binding pro- LV   R+EER   RT+SW Gaps = 2/724 tein RPEL repeat Sbjct 191 (0%) containing 1 LVQGARSEER---RTKSW 205 Strand = Neprilysin (Neutral Plus/Plus endopeptidase) (NEP) (26.1) Selectin-like pro- tein Common acute lymphoblastic leu- kemia Ag precursor Ephrin type B re- ceptor 4 precursor (24.4) Tyrosine protein kinase rcceptor HTK Activated met onco- gene (24) Tpr (translocated promoter region to activated MET onco- gene) Serine/threonine protein phosphatase w/EF-hands-1 Sad1/unc-84-like protein 2 Rab5-interacting protein 34 9_G1 gi|15668084| VFNCWF 6 gi|41351160|gb| Score = 24.4 bits (50), Expect = 13 2 gb|AC09257 AAH65552.1| Id = 5/6 (83%), Positives = 6/6 (100%), 3.2|Homo IARS Gaps = 0/6 (0%) sapiens BAC proteingi|55957316| Query 1 clone RP11- emb|CAI16202.1| 
VFNCWF 6 107 from 2, isoleucine-tRNA VF+CWF complete synthetase Sbjct 413 sequence gi|31874258|emb| VFDCWF 418 Score = 281 CAD98022.1| Score = 20.6 bits (41), Expect = 188 bits (142) hypothetical Id = 4/5 (80%), Positives = 5/5 (100%), Expect = 4e− protein Gaps = 0/5 (0%) 73 gi|32307152|ref| Query 1 Id = 142/142 NP_000907.2| VFNCW 5 (100%) oxytocin receptor VF+CW Gaps = 0/142 Sbjct 184 (0%) VFDCW 188 Strand = Plus/Plus 46 1_D1 gi|34194579| HLKKKKKK 16 gi|119608512|gb| Score = 30.8 bits (65), Expect = 0.37 1 gb|BC05296 KRGTGXLR EAW88106.1 Id = 9/9 (100%), Positives = 9/9 (100%), 3.21|Homo hCG2040846 Gaps = 0/9 (0%) sapiens Snf2- gi|12802996|gb| Query 1 related CBP AAH01198.1| HLKKKKKKK 9 activator Unknown (protein HLKKKKKKK protein, mRNA for IMAGE: Sbjct 959 (cDNA clone 3355871) HLKKKKKKK 967 IMAGE: gi|119586549|gb| Score 29.9 bits (63), Expect = 0.67 5785548), EAW66145.1 Id = 9/9 (100%), Positives = 9/9 (100%), complete cds zinc finger homeo- Gaps = 0/9 (0%) Length = box 2, Query 3 4805 isoform CRA_d KKKKKKKRG 11 Score = 63.9 gi|1703273|sp| KKKKKKKRG bits(32) P50579|AMPM Sbjct 100 Expect = 8e09 2_HUMAN KKKKKKKRG 108 ld = 32/32 Methionine amino- Score = 29.5 bits (62), Expect = 0.90 (100%) peptidase Id = 9/10 (90%), Positives = 10/10 (100%), Gaps = 0/32 2 (MetAP 2) Gaps = 0/10 (0%) (0%) (Peptidase M 2) Query 2 Strand = (Initiation factor LKKKKKKKRG 11 Plus/Minus 2-associated 67 kDa LKKKKKKK+G glycoprotein) (P67) Sbjct 490 (p67eIF2) LKKKKKKKKG 499 gi|687243|gb| AAC63402.1| eIF-2-associated p67 homolog gi|49616863|gb| AAT67238.1| ubiquitin specific protease 42 50 6_H8 gi|4753288| TNSIFGSLE 11 gi|29421174|dbj| Score = 27.4 bits (57), Expect = 2.8 gb|AC004828. 
SY BAA25472.2 Id = 8/10 (80%), Pos = 9/10 (90%), 2|AC004828 KIAA0546 protein Gaps = 0/10 (0%) Homo sapiens (27.4) Query 1 clone gi|34098393|sp| TNSIFGSLES 10 DJ0514A23, O95388|WISP TNS+FG LES complete se- 1_HUMAN Sbjct 736 quence WNT1 inducible TNSVFGGLES 745 Length = signaling pathway Score = 22.7 bits (46), Expect = 98 183249 protein 1 precursor Id = 7/10 (70%), Pos = 7/10 (70%), Score = 670 (WISP-1) (22.7) Gaps = 0/10 (0%) bits (338), gi|11055594|gb| Query 2 Expect = 0.0 AAG28165.1| NSIFGSLESY 11 Id = 351/357 Paraneoplastic N IF  LESY (98%), associated brain- Sbjct 350 Gaps = 0/357 testis-cancer NDIFADLESY 359 (0%) antigen Score = 21.8 bits (44), Expect = 176 Strand = gi|52782735|sp| Id = 6/7 (85%), Positives = 7/7 (100%), Plus/Minus Q7Z628|ARH Gaps = 0/7 (0%) G8_HUMAN Query 4 Neuroepithelial IFGSLES 10 cell transforming +FGSLES gene 1 protein Sbjct 240 (p65 Net1 proto- VFGSLES 246 oncogene) (Rho guanine nucle- otide exchange factor 8) SET binding factor 2 (22.3) Sentrin-specific protease 7 Membrane-associated transporter protein Melanoma antigen AIM1 Solute carrier family 2 Sentrin-specific protease 7 (SUMO-1 specific protease 2) Nuclear pore complex pro- tein Nup98-Nup96 precursor 51 9_C1 gi|13277000| PVLSSHKNE 18 gi|41150972|ref| Score = 23.1 bits (47), Expect = 52 2 emb|AL1387 ARDKGKCH XP_371160.1| Id = 8/17 (47%), Positives = 10/17 (58%), 04.12| P PREDICTED: similar Gaps = 5/17 (29%) Human DNA se- to 60S ribosomal Query 7 quence from protein L21 KNEARDKG-----KCHP 18 clone RP11- gi|20140092|sp| K EA++KG     KC P 417C20 on Q92552|RT27 Sbjct 116 chromosome _HUMAN KKEAKEKGTWVQLKCQP 132 13 Contains Mitochondrial 28S Score = 22.3 bits (45), Expect = 94 the 5′ end of ribosomal protein Id = 7/9 (77%), Positives = 8/9 (88%), gene KIAA1016 S27 (S27mt) Gaps = 0/9 (0%) and a CpG is- (MRP-S27) Query 5 land, com- gi|62906888|sp| SHKNEARDK 13 plete se- Q86T65|DAA SHK EAR+K quence M2_HUMAN Sbjct 37 Length = Disheveled associ- SHKWEAREK 45 165120 ated activator of Score = 
22.3 bits (45), Expect = 129 Score = 236 morphogenesis 2 Id = 6/6 (100%), Positives = 6/6 (100%), bits (119), Adapter-related Gaps = 0/6 (0%) Expect = 3e− protein complex 4 Query 8 59 mu 1 subunit Se- NEARDK 13 Identities = ven transmembrane NEARDK 121/122 helix receptor Sbjct 932 (99%), gi|29164873|gb| NEARDK 937 Gaps = 0/122 AAO65168.1| (0%) sarcoma antigen Strand = NY-SAR-27 Plus/Minus gi|739072|prf| 2002263A E1A-associated protein gp130 gi|397148|emb| CAA52671.1| Rb2|p130 protein gi|6686330|sp| Q08999|RBL2_(—) HUMAN Retinoblastoma-like protein 2; Rb- related p130 protein Proteasome subunit beta type 4 precursor G1 to S phase transition protein 1 homolog Cytochronie p450 53 9_FD1 gi|15055218| PAGISRELV 17 gi|6636|226|pdb| Score = 29.5 bits (62), Expect = 0.89 2 gb|AC06022 DKLAAALE 1YX5|B Id = 9/9 (100%), Positives = 9/9 (100%), 6.39|Homo Chain B, Solution Gaps = 0/9 (0% sapiens 12 Structure Of S5a Query 9 BAC RP11- Uim-1UBIQUITIN VDKLAAALE 17 101P14 COMPLEX Sbjct 84 (Roswell Park gi|118595723|sp| VDKLAAALE 92 Cancer Insti- Q9P212|PLC Score = 24.8 bits (51), Expect = 23 tute Human E1_HUMAN Id = 7/9 (77%), Positives = 9/9 (100%), BAC Library) 1-phosphatidylino- Gaps = 0/9 (0%) complete se- sitol-4,5-bisphos- Query 2 quence phate AGISRELVD 10 Length = phosphodiesterase AGIS+EL+D 132604 epsilon 1 (Phospho- Sbjct 516 Score = 36.2 lipase C-epsilon-1) AGISKELID 524 bits (18) (Phosphoinositide- Score = 22.7 bits (46), Expect = 99 Expect = 5.6 specific phospho- Id = 6/8 (75%), Positives = 8/8 (100%), Id = 18/18 lipase C epsilon-1) Gaps = 0/8 (0%) (100%) (Pancreas-enriched Query 5 Gaps = 0/18 phospholipase C) SRELVDKL 12 (0%) gi|119570428|gb| SR+LV+KL Strand = EAW50043. 
Sbjct 284 Plus/Plus phospholipase C, SRDLVNKL 291 epsilon 1, isoform CRA_b gi|119620997|gb| EAX00592.1| eukaryotic transla- tion initiation factor 2B, subunit 4 delta, 67kDa, isoform CRA_f gi|90185247|sp| Q86W92|L1PB 1_HUMAN Liprin-beta-1 (Pro- tein tyrosine phos- phatase receptor type f polypeptide- interacting pro- tein-binding pro- tein 1) (PTPRF-interacting protein-binding protein 1) (hSGT2) gi|33337571|gb| AAQ13438.1| AF057699_1 EIF-2B-delta-like protein 54 10_B gi|6449479| HTQGCLPM 23 gi|61214481|sp| Score = 31.2 bits (66), Expect = 0.27 12 gb|AC009510. ACAASDSPA Q81ZE3|PACE Id = 9/10 (90%), Positives = 9/10 (90%), 9| CVVCSH 1_HUMAN Gaps = 0/10 (0%) Homo sapiens Protein-associating Query 14 12p11-37.2- with the carboxyl- DSPACVVCSH 23 54.4 BAC terminal domain of DSP CVVCSH RP11- ezrin (Ezrin-bind- Sbjct 438 1110J8 ing protein PACE-1) DSPMCVVCSH 447 (Roswell Park (SCY1-like protein Score = 26.5 bits (55), Expect = 6.7 Cancer In- 3) Id = 11/19 (57%), Positives = 12/19 (63%), stitute Human gi|41688566|sp| Gaps = 3/19 (15%) BAC Library) P60329|KR124 Query 1 complete se- _HUMAN HTQGCLPMACAASDSPACV 19 quence Keratin-associated H+ GC PMAC    SP CV Length = protein 12-4 Sbjct 6 145859 (High sulfur kera- HSSGC-PMACPG--SPCCV 21 Score = 676 tin-associated Score = 24.4 bits (50), Expect = 21 bits (341) protein 12.4) Id = 8/11 (72%), Positives = 9/11 (81%), Expect = 0.0 gi|51468842|ref| Gaps = 1/11 (9%) Id = 358/366 XP_374912.2| Query 6 (97%) PREDICTED: X-ray LPMACAA-SDS 15 Gaps = 0/366 radiation re- LPMAC A S+S (0%) sistance associated Sbjct 482 Strand = 1 TGF-beta LPMACPALSES 492 Plus/Minus resistance-asso- ciated protein TRAG gi|73920974|sp| Q9Y4E6|WDR 7_HUMAN WD-repeat protein 7 (TGF-beta resis- tance-associated protein TRAG) (Rabconacctin-3 beta) 6-phosphofructo- kinase type C Signal recognition particle 19 kDa Gastric mucin Con- tactin associated protein like 3 precursor Cell recognition molecule Caspr3 gi|57015409|sp| Q8IWT3|PAR C_HUMAN p53-associated parkin-like 
cytoplasmic protein (UbcH7 associated protein 1) 57 5_C4 gi|14245766| VQKSGWGL 9 gi|33514905|sp| Score = 21.0 bits (42), Expect = 30 dbj|AP00245 A Q9P2R3|ANF Id = 6/8 (75%), Pos = 6/8 (75%), 6.3|Homo Y1_HUMAN Gaps = 0/8 (0%) sapiens Ankyrin repeat and Query 1 genomic DNA, FYVE domain protein VQKSGWGL 8 chromosome 1 (21) V KSGW L 11q gi|22507380|ref| Sbjct 285 clone: RP11- NP_683766.1 VDKSGWSL 292 956A8 com- G protein-coupled Score = 21.0 bits (42), Expect = 210 plete se- receptor Id = 5/6 (83%), Positives = 6/6 (100%), quence gi|32171177|ref| Gaps = 0/6 (0%) Length = NP_037529.2 Query 1 114109 Over-expressed VQKSGW 6 Score = 961 breast tumor +QKSGW bits(485) protein Sbjct 203 Expect = 0.0 Adaptor-related IQKSGW 208 Id = 485/485 protein complex (100%) (21) Gaps = 0/485 mRNA decapping (0%) enzyme 2 DCP2 Strand = decapping enzyme Plus/Plus (Nudix motif 20) (20.2) Superoxide dismutase, mito- chondrial precursor Sterol regulatory element binding protein cleavage activating protein Kallikrein 4 pre- cursor (Prostase) 62 11_C gi|18450186| KKKKKGV 8 gi|3264861|gb| Score = 23.5 bits (48), Expect = 45 7 gb|AC09316 G AAC78729.1| Id = 7/7 (100%), Positives = 7/7 (100%), 8.3|Homo eukaryotic transla- Gaps = 0/7 (0%) sapiens BAC tion initiation Query 1 clone RP11- factor eIF3, p35 KKKKKGV 7 148M21 from subunit KKKKKGV 7, complete gi|119597676|gb| Sbjct 220 sequence EAW77270.1 KKKKKGV 226 Length = eukaryotic transla- Score = 21.0 bits (42), Expect = 262 148689 tion initiation Id = 7/8 (87%), Positives = 7/8 (87%), Score = 58.0 factor 3, subunit 1 Gaps = 0/8 (0%) bits (29) alpha, 35 kDa, Query 1 Expect = 1e− isoform KKKKKGVG 8 05 gi|119586121|gb| KKKKKG G Id = 41/45 EAW65717. 
Sbjct 723 (91%) cyclin-dependent KKKKKGGG 730 Gaps = 0/45 kinase-like 1 Score = 20.6 bits (41), Expect = 351 (0%) (CDC2-related Id = 6/6 (100%), Positives = 6/6 (100%), Strand = kinase), isoform Gaps = 0/6 (0%) Plus/Minus CRA_d Query 1 gi|18087335|gb| KKKKKG 6 AAL58838.1| KKKKKG AF390028_1 Sbjct 249 serine/threonine KKKKKG 254 protein kinase kkialre- like 1 gi|119625903|gb| EAX05498.1| signal recognition particle 72 kDa, isoform CRA_c gi|119618653I|gb| EAW98247.1 calcium/calmodulin- dependent protein kinase kinase 2, beta, isoform gi|119609183|gb| EAW88777.1 chromodomain helicase DNA binding protein 4 Length = 1911 gi|119589505|gb| EAW69099.1 general transcrip- tion factor IIF, polypeptide 1, 74 kDa, isoform CRA_b 63 2_G3 gi|9801555| LMLPGLSL 20 gi|49522560|gb| Score = 26.5 bits (55), Expect = 7.0 emb|AL07933 PGTLGVRG AAH73937.1| Id = 16/38 (42%), Positives = 16/38 (42%), 5.29|Human SLSK IGKC protein Gaps = 19/38 (50%) DNA sequence gi|18676842|dbj| Query1 from clone BAB85036.1| LML--PG-----------LSLPGTLG------VRGSLS 19 RP1-132F21 unnamed protein LML  PG           LSLP TLG       R SLS on chromosome product Sbjct11 20, complete gi|23111007|ref| LMLWVPGSSGDVVMTQSPLSLPVTLGQPASISCRSSLS 48 sequence NP_076957.3| Score = 26.1 bits (54), Expect = 9.3 Length = hypothetical Id = 8/9 (88%), Positives = 8/9 (88%), 71117 protein Gaps = 0/9 (0%) Score = 196 LOC79018 Query 4 bits (99) gi|73620594|sp| PGLSLPGTL 12 Expect = 4e− Q81VV7|CQ03 PGLSLP TL 48 9_HUMAN Sbjct 52 Identities = Uncharacterized PGLSLPATL 60 99/99 (100%) protein C17orf39 Score = 25.7 bits (53), Expect = 13 Gaps = 0/99 gi|27469499|gb| Id = 9/12 (75%), Positives = 10/12 (83%), (0%) AAH41829.1| Gaps = 0/12 (0%) Strand = Chromosome 17 Query 1 Plus/Minus open reading LMLPGLSLPGTL 12 frame 39 L+L GL LPGTL gi|119623763|gb| Sbjct 4 EAX03358.1| LLLAGLLLPGTL 15 corneodesmosin Score = 25.2 bits (52), Expect = 17 gi|45477124|sp| Id = 8/9 (88%), Positives = 8/9 (88%), Q86UW6|N4B Gaps = 0/9 (0%) P2_HUMAN Query 3 NEDD4-binding 
LPGLSLPGT 11 protein 2 LPGL LPGT (BCL-3-binding Sbjct 297 protein) LPGLDLPGT 305 65 10_F gi|18449730| QIMRSG 7 gi|113420304|ref| Score = 26.1 bits (54), Expect = 6.7 10 gb|AC09269 V XP_001127830.1| Id = 7/7 (100%), Positives = 7/7 (100%), 4.6|Homo PREDICTED: similar Gaps = 0/7 (0%) sapiens 3q to piwi-like 2 Query 1 BAC gi|98990269|gb| QIMRSGV 7 RP11-172A10 ABF60230.1| QIMRSGV (Roswell Park SARP Sbjct 548 Cancer Insti- gi|116241248|sp| QIMRSGV 554 tute Human Q8N9B4|AN Score = 21.8 bits (44), Expect = 128 BAC Library) R42_HUMAN Id = 7/8 (87%), Positives = 7/8 (87%), complete se- Ankyrin repeat Gaps = 1/8 (12%) quence domain-containing Query 1 Length = protein 42 QIM-RSGV 7 151549 gi|119607118|gb| QIM RSGV Score = 444 EAW86712.1 Sbjct 115 bits (224) protein-L- QIMLRSGV 122 Expect = 2e− isoaspartate (D- Score = 20.6 bits (41), Expect = 308 122 aspartate) O- Id = 5/6 (83%), Positives = 6/6 (100%), Id = 224/224 methyltransferase Gaps = 0/6 (0%) (100%) domain containing Query 1 Gaps = 0/224 1, isoform QIMRSG 6 (0%) gi|119580466|gb| QIMR+G Strand = EAW60062.1 Sbjct 21 Plus/Minus MCM5 minichromosome QIMRTG 26 maintenance defic- ient 5, cell divi- sion cycle 46 (S. cerevisiae), iso- form CRA_c gi|27552808|gb| AAH42923.1| Talin 1 gi|546571|gb| AAB30677.1| Wnt4 product [human, breast cell lines, Peptide Partial, 120 aa] 71 9_H4 gi|8748861| SRYW 4 gi|122892392|gb| Score = 18.9 bits (37), Expect = 571 gb|AC019181. 
ABM67263.1 Id = 4/4 (100%), Positives = 4/4 (100%), 4|Homo immunoglobulin Gaps = 0/4 (0%) sapiens BAC heavy chain Query 1 clone RP11- variable region SRYW 4 272E3 from gi|119608878|gb| SRYW 2, complete EAW88472.1 Sbjct 995 sequence G protein-coupled SRYW 998 Length = receptor 112, 190998 isoform CRA_b Score = 240 gi|119599986|gb| bits (121) EAW79580.1 Expect = 1e− immunoglobulin 60 superfamily, member Id = 121/121 11, isoform CRA_b (100%) gi|119591313|gb| Gaps = 0/121 EAW70907.1 (0%) thyroid hormone Strand = receptor interactor Plus/Minus 12, isoform CRA_l gi|119588597|gb| EAW68191.1 F-box protein 3, isoform gi|110735406|ref| NP_573438.2|pro- tein tyrosine phosphatase, receptor typeU 78 2_C1 Homo sapiens TNGSKK 11 gi|62906863|sp| Score = 21.8 bits (44), Expect = 127 1 chromosome 8, EKKLXF P30084|ECHM Identities = 8/10 (80%), Positives = 9/10 clone LXXXXK _HUMAN (90%), Gaps = 0/10 (0%) RP11-35G22, Enoyl-CoA Query 1 complete hydratase, mito- TNGSKKEKKL 10 sequence chondrial precursor T GSKK+KKL Score = 40.1 (Short chain enoyl- Sbjct 1346 bits (20) CoA hydratase) TRGSKKKKKL 1355 Expect = 0.99 (SCEH) (Enoyl-CoA Score = 21.4 bits (43), Expect = 170 Identities = hydratase 1) Identities = 6/6 (100%), Positives = 6/6 20/20 (100%) gi|51702191|sp| (100%), Gaps = 0/6 (0%) Q9H2Y7|ZF10 Query 5 6_HUMAN KKEKKL 10 Zinc finger protein Sbjct 312 106 KKEKXL 317 homolog (Zfp-106) Score = 21.4 bits (43), Expect = 170 gi|57284109|emb| Identities = 9/20 (45%), Positives = 9/20 CAI43036.1| (45%) TAF7-Iike RNA Gaps = 11/20 (55%) polymerase II, TATA Query 1 box binding TNG-----------SKKEKK 9 protein (TBP)- TNG           SKKEKK associated Sbjct 10 factor, 50 kDa TNGQPDQQAAPKAPSKKEKK 29 gi|13603873|gb| Score = 21.0 bits (42), Expect = 229 AAK31974.1| Identities = 6/6 (100%), Positives = 6/6 TBP-associated (100%), Gaps = 0/6 (0%) factor II Q Query 4 gi|401357|sp| SKKEKK 9 Q02004|VGLM_(—) Sbjct 185 DUGBV SKKEKK 190 M polyprotein precursor /Contains: Glyco- protein G2; Non- structural protein 
NS-M; Glycoprotein G1| gi|6671334|eb| AAF23161.1| disabled-2 gi|1706487|sp| P98082|DAB2_(—) HUMAN Disabled homolog 2 (Differentially expressed protein 2) (DOC-2) gi|1706487|sp| P98082|DAB2_(—) HUMAN Disabled homolog 2 (Differentially expressed protein 2) (DOC-2) gi|1110539|gb| AAB19032.1| mitogen-responsive phosphoprotein gi|56202443|emb| CAI20904.1| myeloidVlymphoid or mixed-lineage leukemia (trithorax homolog, Drosophila)\; translocated to, 4 gi|23306684|gb| AAN15215.1| multiple myeloma transforming gene 2 82 10_D gi|18693518| GETPELITT 22 Similar to RIKEN Score = 24.8 bits (51), Expect = 16 7 gb|AC01591 NTSQLNFR cDNA 9130427A09 Id = 7/7 (100%), Positives = 7/7 (100%), 1.8|Homo KQIVP gene (24.8) Gaps = 0/7 (0%) sapiens gi|23512334|gb| Query 7 chromosome AAH38451.1| ITTNTSQ 13 gi|18693518| Similar to RIKEN ITTNTSQ gb|AC01591 cDNA 9130427A09 Sbjct 150 1.8|Homo gene ITTNTSQ 156 sapiens gi|60593095|ref| Score = 24.0 bits (49), Expect = 29 chromosome NP_001012660.1| Id = 10/19 (52%), Positives = 12/19 (63%), 17, clone hypothetical Gaps = 6/19 (31%) RPII-1094M14, protein Query 1 complete se- LOC196996 GETPELITTNTSQLNF-RK 18 quence gi|31566169|gb| G+TP      TS+LNF RK Length = AAH53647.1| Sbjct 185 181561 Hypothetical GQTPA-----TSELNFLRK 198 Score = 496 protein Score = 24.0 bits (49), Expect = 29 bits (250) LOC84978, isoform 1 Ids = 7/8 (87%), Positives = 7/8 (87%), Expect = 6e− gi|22478057|gb| Gaps = 0/8 (0%) 138 AAH36900.1| Query 5 Id = 250/250 ZWILCH protein ELITTNTS 12 (100%) gi|38348727|ref| ELITTN S Gaps = 0/250 NP_071348.3| Sbjct 40 (0%) thyroid adenoma ELITTNNS 47 Strand = associated isoform Plus/Plus 1 gi|34493758|gb| AAO46785.1| death receptor in- teracting protein gi|57162636|emb| CA139911.1| solute carrier family 7 (cationic aa trans- porter, y+ system), member 1 gi|13177667|gb| AAH03618.1| Tara-like protein, isoform 1 gi|20455324|sp| Q9H2D6|TAR A_HUMAN TRIO and F-actin binding protein (Protein Tara) gi|252582|gb| AAB22747.1| IFN-tyk, tyk2 = interferon alpha| 
beta signaling pathway-related protein tyrosine kinase gi|56405328|sp| P29597|TYK2 _HUMAN Non-receptor tyrosine-protein kinase TYK2 84 10_E gi|11493153| PRMRWGX 23 gi|119578750|gb| QScore = 29.1 bits (61), Expect = 1.2 12 emb|AL1185 XXVAXWPV EAW58346.1 Id = 7/7 (100%), Positives = 7/7 (100%), 23.18| PSHW glucosidase, beta Gaps = 0/7 (0%) Human DNA se- (bile acid) 2, Query 15 quence from isoform CRA_b RSWNWGL 21 clone RP5- gi|14042883|dbj| RSWNWGL 1031J8 BAB55430.1| Sbjct 215 on chromosome unnamed protein RSWHWGL 221 20, complete product Score = 28.6 bits (60), Expect = 1.6 sequence gi|119611678|gb| Id = 10/18 (55%), Positives = 14/18 (77%), Length = EAW91272.1 Gaps = 3/18 (16%) 155213 asp (abnormal Query 1 Score = 1449 spindle)-like, VRLVRTEERLELRTRSWN 18 bits (731) microcephaly VRLVRT   +EL T++W+ Expect = 0.0 associated Sbjct 241 Id = 767/777 gi|62906885|sp| VRLVRT---MELLTQNWD 255 (98%) P27987|IP3KB Score = 26.9 bits (56), Expect = 5.1 Gaps = 0/777 _HUMAN Inosito- Id = 10/18 (55%), Positives = 12/18 (66%), (0%) trisphosphate 3- Gaps = 6/18 (33%) Strand = kinase_B Query 3 Plus/Plus gi|21361517|ref| LV---RTEERLELRTRSW 17 NP_056937.2| LV   R+EER   RT+SW SAPK substrate Sbjct 191 protein 1 LVQGARSEER---RTKSW 205 gi|54144631|ref| Score = 26.5 bits (55), Expect = 6.9 NP_112210.1| Id = 9/12 (75%), Positives = 9/12 (75%), phosphatase and Gaps = 2/12 (16%) actin regulator 1 Query 9 gi|70888311|gb| RLELRTRSWNWG 20 AAZ13758.1| RLE RTR  NWG leukocyte specific Sbjct 286 transcript 1 RLETRTR--NWG 295 gi|37222213|gb| Score = 26.1 bits (54), Expect = 9.3 AAQ89957.1| Id = 7/7 (100%), Positives = 7/7 (100%), selectin-like Gaps = 0/7 (0%) protein Query 7 gi|68655017|emb| EERLELR 13 CAF04067.1 EERLELR SEL-OB protein Sbjct 449 gi|119601416|gb| EERLELR 455 EAW81010.1 Score = 26.1 bits (54), Expect = 9.3 splicing factor, Id = 8/11 (72%), Positives = 8/11 (72%), arginine/serine- Gaps = 3/11 (27%) rich 5, isoform Query 7 CRA_e EERLELRTRSW 17 EERLE   RSW Sbjct 31 EERLE---RSW 38 86 11_C 
gi|15808543| REMRLKNT 40 gi|21756842|dbj| Score = 45.6 bits (100), Expect = 1e−05 3 gb|AC09307 KLQSDKRN BAC04969.1| Id = 17/22 (77%), Positives = 17/22 (77%), 3.2|Homo NFGPGAVV unnamed protein Gaps = 0/22 (0%) sapiens HTCNPSTSG product Query 19 chromosome GXVGRIT gi|21389158|gb| GPGAVVHTCNPSTSGGXVGRIT 40 19 clone AAM50513.1| GPGAV H CNPST GG  GRIT LLNLR-275E5, hypothetical Sbjct 136 complete se- protein GPGAVAHACNPSTLGGRGGRIT 157 quence gi|20139105|sp| Score = 44.3 bits (97), Expect = 4e−05 Length = Q99959|PKP2 Id = 16/21 (76%), Positives = 16/21 (76%), 2616 _HUMAN Plakophilin- Gaps=0/21 (0%) Score = 319 2 Query 20 bits (161) gi|15929032|gb| PGAVVHTCNPSTSGGXVGRIT 40 Expect = 2e− AAH14974.1| PG VVH CNPST GG  GRIT 84 Yip1 interacting Sbjct 17 Id = 180/189 factor homolog B, PGTVVHACNPSTLGGQGGRIT 37 (95%) isoform 1 Score = 43.9 bits (96), Expect = 6e−05 Gaps = 0/189 gb|2088182|dbj| Id = 19/27 (70%), Positives = 19/27 (70%), (0%) BAD92538.1| Gaps = 3/27 (11%) Strand = SLC2A11 piotein Query 15 Plus/Minus variant RNNFG-PGAVVHTCNPSTSGGXVGRIT 40 gi|66932986|ref| RN  G PGAV H CNPST GG  GRIT NP_00101938 Sbjct 468 6.1|filamin- RN--GWPGAVAHACNPSTLGGQGGRIT 492 binding LIM protein-1 isoform b gi|3335138|gb| AAC39892.1| RNA polymerase 1 40 kD subunit gi|7416053|dbj| BAA93676.1| survivin-beta gi|386941|gb| AAA59814.1| MHC HLA-DR-beta-1 chain gi|114849|sp| P20931|YYY3_(—) HUMAN Very very hypothetical B-cell growth factor (BCGF-12 kDa) gi|48474670|sp| Q9NRR6|1NP5 _HUMAN (Phosphatidyl- inositol-4,5- bispHosphate 5- phosphatase) gi|20455217|sp| O43353|RIPK2_HUMAN Receptor-interact- ing serine/ threonine-protein kinase 2 88 5_C7 gi|22657585| GITGSRPA 12 gi|119629047|gb| Score = 32.5 bits (69), Expect = 0.12 gb|ACO9156 WPTW EAX08642.1| Id = 8/8 (100%), Positives = 8/8 (100%), 4.12|Homo hCG1816309 Gaps = 0/8 (0%) sapiens chro- gi|119625915|gb| Query 5 mosome 11, EAX05510.1| SRPAWPTW 12 clone homeodomain-only SRPAWPTW RP11-732A19, protein, isoform Sbjct 18 complete se- CRA_i SRPAWPTW 25 
quence gi|55961994|emb| Score = 29.9 bits (63), Expect = 0.68 Length = CA118336.1| Id = 7/7 (100%), Positives = 7/7 (100%), 211735 cAMP responsive Gaps = 0/7 (0%) Score = 321 element binding Query 6 bits (162) protein-like 1 RPAWPTW 12 Expect = 2e− gi|30046775|gb| RPAWPTW 85 AAH50548.1| Sbjct 312 Id = 194/204 KIF4A protein RPAWPTW 318 (95%) gi|119625748|gb| Score = 28.2 bits (59), Expect = 2.2 Gaps = 3/204 EAX05343.1| Id = 7/8 (87%), Positives = 7/8 (87%), (1%) kinesin family Gaps = 0/8 (0%) Strand = member 4A, Query 5 Plus/Plus isoform CRA_b SRPAWPTW 12 gi|119603999|gb| SRPAW TW EAW83593.1 Sbjct 1110 NADH dehydrogenase SRPAWATW 1117 (ubiquinone) 1 alpha subcomplex, 5, 13 kDa, isoform CRA_d 89 11_A gi|15789217| NSSA 4 gi|125950975|sp| Score = 14.2 bits (26), Expect = 14482 1 gb|AC08481 Q9Y4F3|LK Id = 4/4 (100%), Positives = 4/4 (100%), 9.17|Homo AP_HUMAN Limkain-b1 Gaps = 0/4 (0%) sapiens 12p gi|124001558|ref| Query 1 BAC RPII- NP_689799.3| NSSA 4 449P1 ubiquitin specific NSSA (Roswell Park protease 54 Sbjct 90 Cancer Insti- gi|123242853|emb| NSSA 93 tute Human CAI16881.2 BAC Library) quiescin Q6-like 1 complete se- gi|120660404|gb| quence AAI30523.1| Length = Receptor tyrosine 46530 kinase-like Score = 418 orphan receptor 2 bits (211) gi|120659834|gb| Expect = 1e− AAI30364.1| 114 BRCC2 Id = 216/218 gi|114432132|gb| (99%) AB174674.1| Gaps = 0/218 breast and ovarian (0%) cancer susceptibil- Strand = ity protein 2 Plus/Minus truncated variant gi|119628905|gb| EAX08500.1| breast cancer 2, early onset, gi|119625696|gb| EAX05291.1| TAF1 RNA polymerase 11, TATA box bind- ing protein (TBP)- associated factor, 250 kDa, isoform CRA_a gi|119626158|gb| EAX05753.1| protein phospha- tase, EF-hand calcium binding domain 2, isoform gi|119625726|gb| EAX05321.1| myeloid/lymphoid or mixed-lineage leukemia (trithorax homolog, Drosophila); trans- located to, 7, isoform 97 9_C2 gi|11875985| NSNIY 5 gi|119615275|gb| Score = 20.2 bits (40), Expect = 295 emb|AL0968 EAW94869.1| Id = 5/5 
(100%), Positives = 5/5 (100%), 68.15|Human zinc finger, UBR1 Gaps = 0/5 (0%) DNA sequence type 1, isoform Query 1 from clone CRA_c NSNIY 5 RP3-493D19 on gi|82659109|ref| NSNIY chromosome NP_065816.2| Sbjct 356 6q14.3-16.1, retinoblastoma- NSNIY 360 complete se- associated factor quence 600 Length = gi|19070472|gb| 151761 AAL83880.1| Score = 757 AF348492_1 p600 bits (382) gi|33319488|gb| Expect = 0.0 AAQ05647.1| Id = 385/386 AF471472_1 Ig (99%) heavy chain Gaps = 0/386 variable region, (0%) VH3 family Strand = gi|30046973|gb| Plus/Minus AAH50539.1| Solute carrier fam- ily 14 (urea trans- porter), member1 (Kidd blood group) 99 6_C1 gi|20198509| LTLRTKTT 12 gi|89033971|ref| Score = 24.4 bits (50), Expect = 23 2 gb|AC01345 TVTV XP_931830.1 Id = 7/7 (100%), Pos = 7/7 (100%), 2.0|Homo PREDICTED: Hypothe- Gaps = 0/7 (0%) sapiens tical protein Query 2 chromosome 15 XP_931830 TLRTKTT 8 clone RP11- gi|55663026|emb| TLRTKTT 325E5 map CAH172389.1| Sbjct 134 15q21.1, com- Protocadherin 15 TLRTKTT 140 plete gi|3070500|gb| Score = 23.1 bits (47), Expect = 56 sequence AAH51787.1| Id = 8/9 (88%), Pos = 8/9 (88%), Length = Death effector Gaps = 1/9 (11%) 204745 filament-forming Query 2 Score = 831 Ced-4-like TLRT-KTTT 9 bits (419) apoptosis protein TLRT KTTT Expect = 0.0 gi|1916615|gb| Sbjct 610 Id = 446/452 AAC51239.1| TLRTSKTTT 618 (98%) Ribosomal RNA Score = 22.3 bits (45), Expect = 96 Gaps = 3/452 upstream binding Id = 7/9 (77%), Positives = 8/9 (88%), (0%) transcription Gaps = 0/9 (0%) Strand = factor Polycystic Query 4 Plus/Minus kidney and hepatic RTKTTTVTV 12 disease 1 CCM2 RT TTT+TV protein PP10187 Sbjct 245 Caspase recruitment RTTTTTLTV 253 domain protein 7 10 10_B gi|15705226| TRGTKRSW 12 gi|30983668|gb| Score = 25.2 bits (52), Expect = 12  0 10 emb|AL3908 VHSF AAP41104.1| Identities = 8/11 (72%), Positives = 9/11 76.19| Cohen syndrome 1 (81%), Gaps = 1/11 (9%) Human DNA se- protein splice Query 1 sequence from variant 2 TRGTKRSWYHS 11 clone RP11- gi|77416860|sp| TR +KR 
WYHS 394D2 on Q92674|FSHP Sbjct 18 chromosome 9, 1_HUMAN TRTSKR-WYHS 27 complete se- Follicle-stimulat- Score = 24.4 bits (50), Expect = 377 quence ing hormone primary Identities = 7/8 (87%), Positives = 7/8 Length = response protein (87%) 185571 (FSH primary re- Query: 2 Score = 682 sponse protein 1) GTKRSWYH 9 bits (344), gi|256180|gb| GT RSWYH Expect = 0.0 AAB23392.1| Sbjct: 1098 Ide = 365/365 cancer-associated GTVRSWYH 1105 (100%) retinopathy antigen Score = 22.3 bits (45), Expect = 96 Gaps = 0/365 gi|464600|sp| Identities = 6/8 (75%), Positives = 6/8 (0%) P35243|RECO_HUMAN (75%), Strand = Recoverin (Cancer Gaps = 0|8 (0%) Plus/Minus associated retino- Query 2 pathy protein) RGTKRSWY 9 (CAR protein) RG K SWY gi|48428608|sp| Sbjct 712 Q12789|TF3C RGKKWSWY 719 1_HUMAN General transcrip- tion factor 3C polypeptide 1 (Transcription factor IIIC-alpha subunit) (TF3C- alpha) (TFIIIC 220 kDa subunit) (TFIIIC220) (TFIIIC box B-binding subunit) gi|18375626|ref| NP_542417.11 HLA-B associated transcript-2 isoform a gi|4337110|gb| AAD18O86.1| BAT2 gi|55662017|emb| CAH70921.1| v-abl Abelson murine leukemia viral oncogene homolog 2 10 12_C gi|10440580| FFVFYLQS 20 gi|12803135|gb| Score = 27.8 bits (58), Expect = 2.2  1 4 gb|AC01554 RIMTDTKI AAH0237.1| Id = 11/25 (44%), Pos = 14/25 (56%), 2.17|Homo SPLH Nipsnap homolog 1 Gaps = 7/25 (28%) sapiens 3 BAC (27.8) Query 3 RP11-38B6 gi|8216987|emb| VFY-------LQSRIMTDTKISPLH 20 (Roswell Park CAB92443.1| V+Y       ++SRIM   KISPL Cancer Insti- Putative tumor Sbjct 260 tute Human antigen Sarcoma VYYTVPLVRHMESRIMIPLKISPLQ 284 BAC Library) antigen 1 Score = 26.5 bits (55), Expect = 5.3 complete se- gi|24419041|gb| Id = 9 12 (75%), Positives = 10 12 (83%), quence, AAL65133.2| Gaps = 1/12 (8%) Length = Ovarian cancer Query 9 15150 related tumor RI-MTDTKISPL 19 Score = 252 marker CA125 RI MTDT ISP+ bits (127) Sbjct 229 Expect = 1e− RINMTDTGISPM 240 64 Score = 23.5 bits (48), Expect = 41 Id = 137/139 Id = 8/10 (80%), Positives = 8/10 (80%), 
(98%) Gaps = 1/10 (10%) Gaps = 1/139 Query 10 (0%) IMTDT-KISP 18 Strand = IMTDT KI P Plus/Plus Sbjct 9801 IMTDTEKIHP 9810 10 2_C7 none PCEMELESP 16 gi|119622175|gb| Score = 28.6 bits (60), Expect = 1.6  2 HPAXCHL EAX01770.1| Id = 11/16 (68%), Positives = 12/16 (75%), hCG1986848 Gaps = 4/16 (25%) gi|23396460|sp| Query 4 O005121|BCL9_HUMAN MELES-PH--PAXCHL 16 B-cell lymphoma 9 MELES PH  P+ CHL protein (Bcl-9) Sbjct 21 (Legless hoinolog) MELESMPHSVPS-CHL 35 gi|119569780|gb| Score = 25.7 bits (53), Expect = 13 EAW49395.1 Id = 8/11 (72%), Positives = 10/11 (90%), G protein-coupled Gaps = 1/11 (9%) receptor kinase 5, Query 3 isoform CRA_b EMEGPS-PHPA 12 gi|23822155|sp| EMEGP+ P+PA Q9BX26|SYCP2_HUMAN Sbjct 572 Synaptonemal com- EMEGPNVPNPA 582 plex protein 2 Score = 24.4 bits (50), Expect = 31 (SCP-2) Id = 6/9 (66%), Positives = 8/9 (88%), gi|98991767|ref| Gaps = 0/9 (0%) NP_690000.2| Query 8 mitogen-activated SPHPAXCHL 16 protein kinase SPHP+ CH+ kinase kinase 7 Sbjct 31 interacting SPHPSLCHM 39 protein 3 Score = 24.0 bits (49), Expect = 41 gi|37538058|gb| Id = 9/13 (69%), Positives = 9/13 (69%), AAQ92939.1| Gaps = 4/13 (30%) NFkB activating Query 1 protein 1 GC---EM-EGQSP 9 gi|3182968|sp| GC   EM EGQSP O15528|CP27B Sbjct 372 _HUMAN GCLIYEMIEGQSP 384 25-hydroxyvitamin Score = 24.0 bits (49), Expect = 41 D-1 Id = 6/7 (85%), Positives = 7/7 (100%), alpha hydroxylase, Gaps = 0/7 0%) mitochondrial pre- Query 4 cursor (Cytochrome MELESPH 10 P450 subfamily ME+ESPH XXVIIB polypeptide Sbjct 1230 1) (Cytochrome p450 MEIESPH 1236 27B1) (Calcidiol 1- monooxygenase) (25- hydroxyvitamin D(3) 1-alpha-hydroxyl- ase) (VD3 1A hy- droxylase) (P450C1 alpha) (P450VD1- alpha) gi|119620981|gb| EAX00576.1| keratinocyte asso- ciated protein 3 10 8_B1 gi|15789217| NSSA 4 gi|124375836|gb| Score = 14.2 bits (26), Expect = 14482  3 2 gb|AC08481 AAI32680.1| Identities = 4/4 (100%), Positives = 4/4 9.17|Homo Transmembrane (100%) sapiens 12p channel-like 3 Gaps = 0/4 (0%) BAC RP11- gi|124001558|ref| 
Query 1 449P1 NP_689799. NSSA 4 (Roswell Park ubiquitin specific NSSA Cancer Insti- protease54 Sbjct 1178 tute Human gi|120660404|gb| NSSA 1181 BAC Library) AAI30523.1| Length = Receptor tyrosine 46530 kinase-like orphan Score = 480 receptor 2 bits (242) gi|120660344|gb| Expect = 1e− AAI30362.1| 132 BRCC2 Id = 251/255 gi|114432132|gb| (98%) ABI74674.1| Gaps = 0/255 breast and ovarian (0%) cancer susceptibil- Strand = ity protein 2 Plus/Minus truncated variant gi|119628905|gb| EAX08500.1| breast cancer 2, early onset, iso- form CRA_b gi|119620918|gb| EAX00513.1| restin-like 2, isoform CRA_e gi|119625696|gb| EAX05291.1| TAF1 RNA polymerase II, TATA box bind- ing protein 10 5_G4 gi|2688799| GWQERRN 11 gi|l19600046|gb| Score = 23.1 bits (47), Expect = 75  5 gb|AC003681. KLTK EAW79640.1 Id = 6/7 (85%), Positives = 7/7 (100%), 1|AC003681 WD repeat domain Gaps = 0/7 (0%) Human PAC 52, isoform CRA_a Query 3 clone RP3- gi|112180716|gb| QERRNKL 9 394A18 from AAH30606.2 +ERRNKL 22q12.1-qter, Coiled-coil domain Sbjct 704 complete se- containing 11 EERRNKL 710 quence gi|49176521|gb| Score = 23.1 bits (47), Expect = 75 Length = AAT52215.1| Id = 6/9 (66%), Positives = 8/9 (88%), 88528 cell proliferation- Gaps = 0/9 (0%) Score = 224 inducing protein 53 Query 1 bits (113) gi|1196104291|gb| GWQERRNKL 9 Expect = 2e− EAW90023.1 GW+ERR+ L 56 hCG57025, isoform Sbjct 196 Id = 119/121 CRA_c GWEERRDIL 204 (98%) gi|110611170|ref| Score = 22.7 bits (46), Expect = 100 Gaps = 0/121 NP_620688.2|ADAM Id = 6/6 (100%), Positives = 6/6 (100%), (0%) metallopeptidase Gaps = 0/6 (0%) Strand = with thrombospondin Query 4 Plus/Minus type 1 motif, 17 ERRNKL 9 preproprotein ERRNKL gi|74723116|sp| Sbjct 25 Q70EL4|UBP4 ERRNKL 30 3_HUMAN Score 22.7 bits (46), Expect = 100 Ubiquitin carboxyl- Id = 7/11 (63%), Positives = 9/11 (81%), terminal hydrolase Gaps = 2/11 (18%) 43 (Ubiquitin Query 2 thioesterase 43) WQERRN--KLT 10 gi|119582112|gb| W+ERRN  +LT EAW61708. 
Sbjct 219 dynactin 4 (p62), WRERRRAIRLT 229 gi|119581530|gb| EAW61126.1 polymerase (DNA directed), mu, isoform CRA_f gi|119614601|gb| EAW94195.1 SMAD specific E3 ubiquitin protein ligase gi|12018151|gb| AAG45422.1| E3 ubiquitin ligase SMURF2 gi|81022914|gb| ABB55266.1| rhabdomyosarcoma antigen MU-RMS-40.8 10 1_H4 gi|12001748| VEGKVARC 23 gi|51491211|emb| Score = 25.2 bits (52), Expect = 17  7 emb|AL3550 QSKSPGFED CAH18671.1|hypo- Id = 7/10 (70%), Positives = 9/10 (90%), 52.3|CNS05TC4 GLFGKF thetical Gaps = 0/10 (0%) Human chromo- proteingi| Query 8 some 14 DNA 119598885|gb|EAW CQSRSPGFED 17 sequence BAC 78479.1|hCG2040711 CQSKSPG ++ R-1070A8 of gi|119605408|gb| Sbjct 409 library RPC1- EAW85002.1 CQSKSPGLDN 418 11 from chro- coagulation factor Score = 24.8 bits (51), Expect = 22 mosome 14 of XII Id = 9/16 (56%), Positives = 9/16 (56%), Homo sapiens gi|116242624|sp| Gaps = 5/16 (31%) (Human), com- Q12852|M3K Query 4 plete se- 12_HUMAN KVARCQSKSPGFEDGL 19 quence Mitogen-activated KV RC     GFED L Length = protein kinase Sbjct 69 179712 kinase kinase 12 KVGRC-----GFEDNL 79 Score = 549 (Mixed lineage Score = 24.0 bits (49), Expect = 40 bits (277) kinase) Id = 7/9 (77%), Positives = 7/9 (77%), Expect = 6e− gi|561543|gb| Gaps = 0/9 (0%) 154 AAA67343.1| Query 5 Id = 277/277 serine/threonine VARCQSKSP 13 (100%) protein kinase VARCQ K P Gaps = 0/277 gi|4033480|sp| Sbjct 46 (0%) Q13595|TRA2A VARCQCKGP 54 Strand = _HUMAN Plus/Minus Transformer-2 protein homolog (TRA-2 alpha) 11 10_B LTSKTQ 8 gi|17368711|sp| Score = 23.1 bits (47), Expect = 70  0 RK Q9BZE4|NOG Identities = 8/9 (88%), Positives = 8/9 1_HUMAN (88%), Gaps = 1/9 (11%) Nucleolar GTP- Query 7 binding LT-SKTQRK 14 protein 1 (Chronic LT SKTQRK renal failure gene Sbjct 21 protein) (GTP- LTLSKTQRK 29 binding protein Score = 23.1 bits (47), Expect = 43 NGB) Id = 8/9 (88%), Positives = 8/9 (88%), gi|3153873|gb| Gaps = 1/9 (11%) AAC24364.1| Query 1 putative G-binding LT-SKTQRK 8 protein LT SKTQRK gi|27923746|sp| Sbjct 5 
Q96P48|CEN LTLSKTQRK 13 D2_HUMAN Score = 23.1 bits (47), Expect = 43 Centaurin delta 2 Id = 8/9 (88%), Positives = 8/9 (88%), (Cnt-d2) Gaps = 1/9 (11%) gi|51464006|ref| Query 1 XP_497909.1| LT-SKTQRK 8 similar to dual LT SKTQRK specificity phos- Sbjct 19 phatase 5; VH1-like LTLSKTQRK 27 phosphatase 3; Score = 21.4 bits (43), Expect = 228 serine/threonine Identities = 6/6 (100%), Positives = 6/6 specific protein (100%) phosphatase Gaps = 0/6 (0%) gi|62089088|dbj| Query 8 BAD92988.1| TSKTQR 13 Rho GTPase activat- TSKTQR ing protein 5 Sbjct 684 variant TSKTQR 689 gi|55859653|emb| Score = 21.4 bits (43), Expect = 140 CA110958.1| Id = 6/6 (100%), Positives = 6/6 (100%), nuclear receptor Gaps = 0/6 (0%) subfamily 5, group Query 2 A, member 1 TSKTQR 7 gi|339550|gb| TSKTQR AAA50404.1| Sbjct 119 transforming growth TSKTQR 124 factor-beta-2 Score = 19.7 bits (39), Expect = 452 precursor Id = 6/7 (85%), Positives = 7/7 (100%), gi|62896709|dbj| Gaps = 0/7 (0%) BAD96295.1| Query 2 TATA binding TSKTQRK 8 protein interacting TSKT+RK protein 49 kDa Sbjct 820 variant TSKTKRK 826 Score = 18.9 bits (37), Expect = 814 Id = 5/5 (100%), Positives = 5/5 (100%), Gaps = 0/5 (0%) Query 4 KTQRK 8 KTQRK Sbjct 59 KTQRK 63 11 10_C gi|55734203| GDPNSS 6 gi|119588281|gb| Score = 18.5 bits (36), Expect = 1148  3 1 emb|CR8614 EAW67875.1 Id = 5/5 (100%), Positives = 5/5 (100%), 77.1|Human protein tyrosine Gaps = 0/5 (0%) DNA sequence phosphatase, Query 1 from clone receptor typeJ GDPNS 5 XX-NCIH2171_(—) gi|11961566-|gb| GDPNS 4G20, com- EAW95254.1 Sbjct 177 plete cytoplasmic linker GDPNS 181 sequence associated protein Score = 18.5 bits (36), Expect = 1148 Length = 1, isoform Id = 5/5 (100%), Positives = 5/5 (100%), 121004 gi|119607396|gb| Gaps = 0/5 (0%) Score = 577 EAW86990.1 Query 2 bits (291) telomeric repeat DPNSS 6 Expect = 2e− binding factor DPNSS 162 (NIMA-interacting) Sbjct 2031 Id = 315/315 1 DPNSS 2035 (100%) gi|119613827|gb| Gaps = 0/315 EAW93421.1 (0%) TNF reccptor- Strand = associated 
factor Plus/Plus 5, isoform CRA_a gi|119597101|gb| EAW76695.1 transformation| transcription domain-associated protein, gi|119572538|gb| EAW52153.1 protease, serine, 36, isoform gi|116242829|sp| Q9Y4A5|TR RAP_HUMAN Transformation| transcription do- main-associated protein (350/400 kDa PCAF- associated factor) gi|119597105|gb| EAW76699.1 transformation| transcription do main-associated protein, isoform CRA_g 11 9_C6 gi|18873821| GDPNSVY 7 gi|62898359|dbj| Score = 21.0 bits (42), Expect = 228  6 gb|AC01027 BAD97119.1| Id = 6/7 (85%), Positives = 6/7 (85%), 9.5|Homo guanine monophos- Gaps = 0/7 (0%) sapiens phate synthetase Query = 1 chromosome 5 variant GDPNSVY 7 clone CTC- G PNSVY 533D18, com- Sbjct 171 plete se- GGPNSVY 177 quence Length = 125746 Score = 109 bits (55) Expect = 1e− 21 Id = 63/67 (94%) Gaps = 0/67 (0%) Strand = Plus/Minus 11 6_G3 gi|244d18207| YVYRLPIRS 26 gi|4758412|ref| Score = 29.1 bits (61), Expect = 1.2  9 gb|AC08739 LTGGAGGG NP_004472.1| Id = 11/15 (73%), Positives = 12/15 (80%), 2.10|Homo GRQEAWM Polypeptide N- Gaps = 1/15 (6%) sapiens GT acetylgalactosa- Query 10 chromosome minyltransferase 2 LTGGAGGG-CRQEAW 23 17, clone Growth/differentia- L GGAGGG GR+E W RP11-676J12, tion factor-11 Ras Sbjct 31 complete se- and Rab interactor LAGGAGGGAGRKEDW 45 quence 3 Length = 196772 Score = 1051 bits (530) Expect = 0.0 Id = 534/536 (99%) Gaps = 0/536 (0%) Strand = Plus/Minus 12 12_C gi|22002645| VWAA 4 gi|31419318|gb| Score = 16.8 bits (32), Expect = 1771  0 12 emb|AL1588 AAH52995.1| Identities = 4/4 (100%), Positives = 4/4 27.27|Human SAPS2 protein (100%), DNA sequence gi|32810412|gb| Gaps = 0/4 (0%) from clone AAO65542.1| Query 1 RP11-330M2 on UDP-glucuronosyl- VWAA 4 chromosome 9, transferase 2B7 VWAA complete se- gi|13699834|ref| Sbjct 684 quence NP_085080.1| VWAA 687 Length = matrilin 4 186108 isoform 2 Score = 531 gi|55962492|emb| bits (268) CAI17303.1| Expect = 1e− receptor tyrosine 148 kinase-like orphan Id = 272/274 receptor 2 (99%) 
gi|57242755|ref| Gaps = 0/274 NP_055759.3| (0%) calsyntenin 1 Strand = isoform 2 Plus/Plus gi|45767643|gb| AAH67427.1| Cytochrome P450, family 1, sub- family A, poly- peptide 2 ATP- binding cassette Solute carrier family 26, member 11 12 12_D gi|37694066| CTNGILLK 10 gi|29568109|ref| Score = 24.4 bits (50), Expect = 22  3 2 ref|NM_1979 KI NP_055520.2| Id = 7/8 (87%), Positives = 7/8 (87%), 55.1|Homo dedicator of Gaps = 0/8 (0%) sapiens cytokinesis 4 Query 3 chromosome 15 gi|29335973|gb| NGILLKKI 10 open reading AAO73565.1| N ILLKKI frame 48 DOCK4 Sbjct 1132 (C15orf48), gi|18089123|gb| NSILLKKI 1139 transcript AAH20733.1| Score = 23.1 bits (47), Expect = 54 variant 1, Sushi-repeat- Id = 6/7 (85%), Positives = 7/7 (100%), mRNA containing protein, Gaps = 0/7 (0%) Length = X-linked 2 Query 1 815 gi|22001626|sp| CTNGILL 7 Score = 214 Q9UKD1|GM CTNG+LL bits (108) EB2_HUMAN Sbjct 135 Expect = 1e− Glucocorticoid mod- CTNGVLL 141 53 ulatory element- Score = 22.3 bits (45), Expect = 98 Id = 115/118 binding protein 2 Id = 6/8 (75%), Positives = 8/8 (100%), (97%) (GMEB-2) Gaps = 0/8 (0%) Gaps = 0/118 (Parvovirus Query 3 (0%) initiation factor NGILLKKI 10 Strand = p79)(PIF p79) NGI+L+KI Plus/Minus (DNA-binding Sbjct 149 protein p79PIF) NGIMLRKI 156 516.1| Score = 21.8 bits (44), Expect = 131 nucleic acid hel- Id = 6/9 (66%), Positives = 8/9 (88%), icase DDXx Gaps = 0/9 (0%) gi|62897301|dbj| Query 1 BAD96591.1| CTNGILLKK 9 Fanconi anemia, CT G+LL+K complemeatation Sbjct 678 group C variant CTTGVLLRK 686 gi|16506130|dbj| BAB70696.1| phosphatidylino- sitol 3-kinase- related protein kinase gi|62087846|dbj| BAD92370.1| ataxia telangi- ectasia mutated protein isoform 1 variant gi|32483361|ref| NP_863658.1| apoptotic protease activating factor isoform d 12 8_G8 gi|11544447| HSHISNRKT 30 gi|31874131|emb| Score = 28.6 bits (60), Expect = 1.1  6 emb|AL1393 TNGYLEVA CAD97974.1| Id = 8/10 (80%), Pos = 10/10 (100%), 49.36|Human PTWKGKAG hypothetical Gaps = 0/10 (0%) DNA sequence QGFGH 
protein Query 20 from clone gi|15214067|sp| WKGKAGQGFG 29 RP11-261P9 O95405|ZFYV W+G+AGQGFG on chromosome 9_HUMAN Sbjct 5 20 (Receptor activa- WRGRAGQGFG 14 Length = tion anchor) core = 26.5 bits (55), Expect = 4.8 68754 Serine protease- Id = 12/19 (63%), Pos = 12/19 (63%), Score = 1035 like protein Smad Gaps = 4/19 (21%) bits (522) anchor for receptor Query 4 Expect = 0.0 activation ISNRK--TTNGYLEVAPTW 20 Id = 623/647 gi|55741557|ref| IS RK  TT G  EVAP W (96%) NP_055809.1 Sbjct 578 Gaps = 10/647 Mitogen-activated ISARKPFTTLG--EVAPVW 594 (1%) protein kinase Score = 23.5 bits (48), Expect = 38 Strand = binding protein 1 Id = 9/16 (56%), Pos = 12/16 (75%), Plus/Minus gi|182775|gb| Gaps = 2/16 (12%) AAA58487.1| Query 7 v-fos transforma- RKTTNGY-LEVAPTWK 21 tion effector RKTT  Y ++V P+WK protein Sbjct 609 Mothers against RKTTL-YDMDVEPSWK 623 decapentaplegic homolog interacting protein (Madh- interacting pro- tein) Novel serine protease Adapter related protein complex 3 beta 1 subunit Dickkopf- like protein 1 precursor 12 9_H7 gi|23268261| YRLMEEN 7 gi|119592213|gb| Score = 24.4 bits (50), Expect = 22  7 gb|AC12978 EAW71807.1 Id = 6/6 (100%), Positives = 6/6 (100%), 2.3|Homo zona pellucida Gaps = 0/6 (0%) sapiens BAC glycoprotein 3 Query 2 clone RP111- (sperm receptor), RLMEEN 7 28O7 from isoform CRA_a RLMEEN UL, complete gi|113415975|ref| Sbjct 124 sequence XP_0011291 RLMEEN 129 Length = 78.1|PREDICTED: Score = 24.4 bits (50), Expect = 22 66860 similar to multiple Id = 7/8 (87%), Positives = 7/8 (87%), Score = 172 coiled-coil Gaps = 1/8 (12%) bits (87) GABABR1-binding Query 1 Expect = 3e− protein YRL-MEEN 7 41 gi|28436730|gb| YRL MEEN Id = 92/94 AAH47075.1| Sbjct 564 (97%) Janus kinase and YRLEMEEN 571 Gaps = 0/94 microtubule inter- Score = 18.5 bits (36), Expect = 1331 (0%) acting protein 1 Id = 4/5 (80%), Positives = 5/5 (100%), Strand = gi|38641276|gb| Gaps = 0/5 (0%) Plus/Plus AAR26235.1| Query 2 MARLIN1 RLMEE 6 gi|119602807|gb| RLM+E EAW82401.1 Sbjct 207 janus 
kinase and RLMDE 211 microtubule inter- Score = 24.4 bits (50), Expect = 22 acting protein 1, Id = 7/8 (87%), Positives = 7/8 (87%), isoform CRA_a Gaps = 1/8 (12%) gi|5817226|emb| Query 1 CAB53703.1| YRL-MEEN 7 hypothetical YRL MEEN protein Sbjct 571 gi|11995070|dbj| YRLEMEEN 578 BAB20049.1| Score = 18.5 bits (36), Expect = 1331 calmodulin- Id = 4/5 (80%), Positives = 5/5 (100%), dependent Gaps = 0/5 (0%) phosphodiesterase Query 2 gi|16151615|emb| RLMEE 6 CAC82208.1 RLM+E 3′5′-cyclic Sbjct 207 nucleotide RLMDE 211 phosphodiesterase 1A5 gi|119631376|gb| EAX10971.|1 phosphodiesterase 1A, calmodulin- dependent, isoform CRA_g gi|1705942|sp| P54750|PDE1A _HUMAN Calcium/calmodu- lin-dependent 3′, 5′-cyclic nucleo- tide phosphodi- esterase 1A gi|119604572|gb| EAW84166.1 SW1/SNF related, matrix associated, actin dependent regulator of chromatin, sub- family a, member 4 gi|4056413|gb| AAC97987.1| SN24_HUMAN; nuclear protein GRB1; home- otic gene regula- tor; SNF2-BETA gi|738309|prf|| 1924378A nucler protein GRB1 gi|505088|dbj| BAA05143.1| transcriptional activator hSNF2b gi|109734817|eb| AAI17695.1| DOCK4 protein gi|40254834|ref| NP_006603.2| kinesin family member 1C gi|116242606|sp| O43896|KIF1 C_HUMAN Kinesin-like pro- tein KIF1C gi|109734809|gb| AA117689.1| Dedicator of cytokinesis 4 12 9_D1 gi|20800377| KSFKVNI 13 gi|113415614|ref| Score = 26.1 bits (54), Expect = 9.5  8 1 gb|AC11661 SLMFCK XP_0011282 Id = 9/20(45%), Positives = 10/20 (50%), 8.4|Homo 50.1|PREDICTED: Query 2 sapiens BAC hypothetical SFKYNISL--------MFCK 13 clone RP11- protein S+ Y ISL        MFCK 98L17 from gi|119611411|gb| Sbjct 7 4, complete EAW91005. 
SYMYQISLQQAFCTVIMFCK 26 sequence astrotactin, Score = 25.7 bits (53), Expect = 13 Length = isoform CRA_a Id = 7/8 (87%), Positives = 7/8 (87%), 153040 gi|119602676|gb| Gaps = 1/8 (12%) Score = 220 EAW82270.1 Query 5 bits (111) hCG1995022 YNISLMFC 12 Expect = 2e− gi|20270245|ref| YNI LMFC 55 NP_612474.1| Sbjct 686 Id = 113/114 GLI-Kruppel family YNI-LMFC 692 (99%) member GLI4 Score = 24.8 bits (51), Expect = 23 Gaps = 0/114 gi|33302619|sp| Id = 8/10 (80%), Positives = 9/10 (90%) (0%) P10075|GLI4 Gaps = 1/10 (10%) Strand = HUMAN Query 1 Plus/Minus Zinc finger KSFKYNISLM 10 protein GLI4 KSFKYN SL+ (krueppel-related Sbjct 190 zinc finger protein KSFKYN-SLL 198 4) (Protein HKR4) gi|119576919|gb| EAW56515.1 CTTNBP2 N-terminal like, gi|119570340|gb| EAW49955.1 Rho GTPase activa- ting protein 19, isoform CRA_d 12 6_C3 none GISTLK 6 gi|119574273|gb| Score = 20.6 bits (41), Expect = 426  9 EAW53888.1 Id = 6/6 (100%), Positives = 6/6 (100%), hCG2040542 Gaps = 0/6 (0%) gi|119628905|gb| Query 1 EAX08500.1| GISTLK 6 breast cancer 2, GISTLK early onset, Sbjct 170 isoform CRA_b GISTLK 175 gi|55957538|emb| Score = 19.3 bits (38), Expect = 1030 CA113195.1| Id = 6/10 (60%), Positives = 6/10 (60%), BRCA2 Gaps = 0/10 (0%) gi|14424438|sp| Query 2 P51587|BRCA ISTLKXXXXK 11 2_HUMAN ISTLK    K Breast cancer type Sbjct 580 2 susceptibility ISTLKKKTNK 589 protein (Fanconi Score = 18.9 bits (37), Expect = 1381 anemia group D1 Id = 6/10 (60%), Positives = 6/10 (60%), protein) Gaps = 0/10 (0%) gi|42793995|gb| Query 3 AAH66592.1| STLKXXXXKL 12 CUTL1 protein STLK    KL gi|119570613|gb| Sbjct 336 EAW50228.1 STLKQLEEKL 345 cut-like 1, CCAAT displacement pro- tein (Drosophila), isoform gi|44887461|gb| AAS48058.1| T cell antigen re- ceptor beta chain gi|1552504|gb| AAC80198.1| V segment transla- tion product 13 10_D gi|15451718| GDPNS 5 gi|119615660|gb| Score = 18.5 bits (36), Expect = 957  0 3 gb|AC02297 EAW95254.1 Id = 5/5 (100%), Positives = 5/5 (100%), 3.5|Homo cytoplasmic linker Gaps = 0/5 (0%) 
sapiens, associated protein Query 1 clone 1, isoform CRA_c GDPNS 5 RP11-473O4, gi|119607399|gb| GDPNS complete se- EAW86993.1 Sbjct 216 quence telomeric repeat GDPNS 220 Length = binding factor 194367 (NIMA-interacting) Score = 180 1, isoform CRA_a bits (91) gi|119612111|gb| Expect = 2e− EAW91705.1 43 protein phosphatase Id = 91/91 2C, magnesium- (100%) dependent, cata- Gaps = 0/91 lytic subunit, (0%) isoform Strand = gi|119588281|gb| Plus/Minus EAW67875.1 protein tyrosine phosphatase, re- ceptor type, J, isoform CRA_c gi|11342O339|ref |XP_944412. 2|PREDICTED: similar to growth inhibition and differentiation related protein 86  3 10_C gi|19807899| EMKRHIST 37 gi|585487|sp| Score = 27.8 bits (58), Expect = 5.1 3 gb|AC11076 LRWKTCLN Q07325|SCYB9_(—) Id = 14/28 (50%), Pos = 17/28 (60%), 9.2|Homo ANMKELLE HUMAN Gaps = 11/28 (39%) sapiens BAC IKVTGKIRY Small inducible Query 12 clone RPII- NQGL cytokine B9 pre- ISTLRWK----TCLN---ANMKELLEIK 32 141B14 cursor (CXCL9) I+TL  K    TCLN   A++ KEL IK from 2, com- Gamma interferon Sbjct 64 plete se- induced monokine IATL--KNGVQTCLNPDSADVKEL--IK 87 quence (MIG) (27.8) Score = 26.1 bits (54), Expect = 9.0 Length = gi|62898822|dbj| Id = 9/13 (69%), Pos = 12/13 (92%), 135317 BAD97265.1 Gaps = 1/13 (7%) Score = 599 Serologically de- Query 19 bits (302) fined colon cancer MKELLEIKVTGKI 31 Expect = 2e− antigen 33 variant M+EL+E KVTGK+ 168 Antigen NY-CO-33 Sbjct 270 Id = 312/317 (26.1) MEELVE-KVTGKV 281 (98%) gi|31077164|sp| Score = 24.0 bits (49), Expect = 39 Gaps = 0/317 O95757/HS74 Id = 6/6 (100%), Pos = 6/6 (100%), (0%) L_HUMAN Gaps = 0/6 (0%) Strand = Heat shock 70 kDa Query 8 Plus/Plus protein 4L TLRWKT 13 gi|21315086|gb| TLRWKT AAH30792.1| Sbjct 403 Cyclin-dependent TLRWKT 408 kinase 5, regulatory subunit 1 Transient receptor potential cation channel subfamily V member 3 Vanilloid receptor-like 3 (VRL-3) Dehydrodolichyl diphosphatc syn- thase Tyrosyl-tRNA synthetase Putative mitochon- drial outer mem- brane protein im- port 
receptor  5 10_G gi|21686938| CINMDSPPK 11 gi|57208724|emb| Score = 24.8 bits (51), Expect = 338 8 gb|AC11605 QC CA142568.1 Id = 6/7 (85%), Pos = 7/7 (100%), 0.3|Homo GNAS complex locus Gaps = 0/7 (0%) sapiens BAC (OTTHUMP0000003173 Query 2 clone RP11- 7) INMDSPP 8 427F2 from Guanine nucleotide +NMDSPP 2, complete binding protein, Sbjct 195 sequence alpha stimulating VNMDSPP 201 Length = activity polypep- Score = 22.3 bits (45), Expect = 131 165225 tide 1 Id = 7/9 (77%), Pos = 7/9 (77%), Score = 343 gi|68565390|sp| Gaps = 2/9 (22%) bits (173) Q14676|MDC1_HUMAN Query 4 Expect = 3e− Mediator of DNA MDSPP--KQ 10 92 damage checkpoint MDSPP  KQ Id = 179/182 protein 1 (Nuclear Sbjct 1818 (98%) factor with BRCT MDSPPHQKQ 1826 Gaps = 0/182 domains 1) Score 22.3 bits (45), Expect = 131 (0%) gi|50400452|sp| Id = 8/12 (66%), Pos = 8/12 (66%), Strand = Q7Z569/BRAP Gaps = 2/12 (16%) Plus/Plus _HUMANBRCA1-associ- Query 1 ated protein CINM--DSPPKQ 10 (Impedes mitogenic CIN   DSP KQ signal propagation) Sbjct 110 (IMP) CINAAPDSPSKQ 121 gi|739072|prf|| 2002263A E1A-assoc protein gp130 Retinoblastoma-like protein 2 gi|55957867|emb| CA113220.1 E74-like factor 1 (ets domain trans- cription factor) RUN and SH3 domain containing protein 2 Ezrin-radixin- moesin binding phosphoprotein 50 Impedes mitogenic signal propagation (IMP) Membrane-associated nucleic acid bind- ing protein E1A-associated protein gp130 13 6_C8 gi|21263318| GAGWEWV 7 gi|18089035|gb| Score = 23.1 bits (47), Expect = 38 gb|AC10444 AAH20586.1| Id = 5/5 (100%), Pos = 5/5 (100%), 1.2|Homo SERS14 protein Gaps = 0/5 (0%) sapiens gi|49256613|gb| Query 3 chromosome 3 AAH73912.1| GWEWV 7 clone RP11- ACCN4 protein GWENV 901H12, Regenerating islet Sbjct 946 complete se- derived protein 3 GWEWV 950 quence alpha precursor Score = 22.7 bits (46), Expect = 50 Length = Pancreatitis Id = 5/5 (100%), Positives = 5/5 (100%), 2177320 associated protein Gaps = 0/5 (0%) Score = 262 1 (PAP) (21.8) Query 2 bits (132) ELAM-1 ligand AGWEW 6 
Expect = 4e− fucosyltransferase AGWEW 67 Mitochondrial ri- Sbjct 255 Id = 160/171 bosomal protein AGWEW 259 (93%) bMRP64 Gaps = 1/171 (0%) Strand = Plus/Minus 14 6_D4 gi|15004913| PLCLASLLS 57 gi|57209194|emb| Score = 32.9 bits (70), Expect = 0.19 gb|AC00947 FIVCLFHFR CAI41407.1 Id = 8/9 (88%), Positives = 9/9 (100%), 5.5|Homo YLPTILLPP Dedicator of Gaps = 0/9 (0%) sapiens BAC I cytokinesis 11 Query 12 clone RP11- LKHKCNDR DOCK11 protein VCLFHFRYL 20 285F23 from MHLTCFGS Cdc42-associated VCLFHFRY+ 2, complete AKALMYSL guanine nucleotide Sbjct 1319 sequence SNNRC exchange factor VCLFHFRYM 1327 Score = 835 ACG|DOCK11 Score = 30.3 bits (64), Expect = 1.7 bits(421) gi|13634012|sp| Id = 14/44 (31%), Pos = 20/44 (45%), Expect = 0.0 Q15884|C161_(—) Gaps = 21/44 (47%) Id = 424/425 HUMAN 6 (99%) Protein C9orf61 SPLCLA-------SLLSFIVCLFHFR--------------YLPT Gaps = 0/425 (Protein X123) 28 (0%) Probable G-protein S+ C+A       S+LSF+VC F +R              +LPT Strand = coupled receptor 37 Plus/Minus 113 precursor SRMCMAISICQMLSMLSFVVCAFRYRHMFKRGWPMGTCCLFLPT (G-protein coupled 80 receptor PGR23) Wingless-type MMTV integration site family 20 10_B gi|21747795| NSFHN 5 gi|119626686|gb| Score = 20.2 bits (40), Expect = 295 11 gb|AC12486 EAX06281.1| Id = 5/5 (100%), Positives = 5/5 (100%), 4.3|Homo hCG21296, isoform Gaps = 0/5 (0%) sapiens BAC CRA_c Query 1 clone RPd11- gi|119615394|gb| NSFHN 5 570J4 from 4, EAW94988.1 NSFHN complete se- ubiqititin specific Sbjct 310 quence peptidase NSFHN 314 Length = 48, isoform CRA_b 166797 gi|119584494|gb| Score = EAW64090.1 224 bits solute carrier (113) family 6 Expect = 4e− (neurotransmitter 56 transporter, GABA), Id = 113/113 member 1, isoform (100%) CRA_a Gaps = 0/113 (0%) Strand = Plus/Plus 25 9_C4 gi|22657585| GITGSRPA 12 gi|119623989|gb| Score 29.9 bits (63), Expect = 0.68 gb|AC09156 WPTW EAX03584.1| Id = 7/7 (100%), Positives = 7/7 (100%), 4.12|Homo cAMP responsive Gaps = 0/7 (0%) sapiens element binding Query 6 chromosome protein-like 1 
RPAWPTW 12 11, clone gi|14250004|gb| RPAWPTW RP11-732A19, AAH08394.1| Sbjct 312 complete se- CREBL1 protein RPAWPTW 318 quence gi|119625915|gb| Length = EAX05510.1| 211735 homeodomain-only Score = 317 protein bits (160) gi|24286115|gb| Expect = 4e− AAN46678.1| RPAWPTW 84 hyypothetical Id = 193/204 protein (94%) HGRHSVI Gaps = 3/204 (1%) Strand = Plus/Plus
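
The Id, Positives, and Gaps figures reported for each alignment in the table above can be recomputed from the three alignment rows (Query, match line, Sbjct). The following is a minimal sketch, not part of the original disclosure. It assumes the standard BLASTP match-line convention (a letter marks an identical residue, "+" marks a conservative substitution, a space marks a mismatch or gap) and assumes that percentages are truncated to whole numbers, as the table values suggest. The function name `alignment_stats` is hypothetical and used only for illustration.

```python
def alignment_stats(query: str, midline: str, subject: str) -> dict:
    """Recompute Id / Positives / Gaps counts from one BLAST-style alignment.

    Assumption: the match line uses letters for identities, '+' for
    conservative substitutions, and spaces elsewhere.
    """
    if not (len(query) == len(midline) == len(subject)):
        raise ValueError("alignment rows must have equal length")
    length = len(query)
    if length == 0:
        raise ValueError("alignment rows must not be empty")

    # Identical residues are echoed as letters on the match line.
    identities = sum(1 for ch in midline if ch.isalpha())
    # Positives are identities plus conservative substitutions ('+').
    positives = identities + midline.count("+")
    # Gaps are '-' characters inserted into either aligned sequence.
    gaps = query.count("-") + subject.count("-")

    def pct(n: int) -> int:
        # Assumed to truncate, matching values such as 10/18 (55%).
        return n * 100 // length

    return {
        "length": length,
        "identities": identities,
        "positives": positives,
        "gaps": gaps,
        "identities_pct": pct(identities),
        "positives_pct": pct(positives),
        "gaps_pct": pct(gaps),
    }


# Alignment taken from the table: Id = 10/18, Positives = 14/18, Gaps = 3/18.
example = alignment_stats(
    "VRLVRTEERLELRTRSWN",
    "VRLVRT   +EL T++W+",
    "VRLVRT---MELLTQNWD",
)
```

Applied to the VRLVRT alignment listed above, the sketch gives the same counts as the table entry for that alignment; Score and Expect values cannot be recovered this way because they depend on the scoring matrix and database size.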


1. A diagnostic device for use in detecting the presence of head and neck squamous cell carcinoma (HNSCC) in a patient comprising detection means for detecting a presence of at least one marker in a patient's serum indicative of HNSCC, said detection means including a panel of markers for HNSCC.
 2. The diagnostic device of claim 1, wherein said detection means is selected from the group consisting essentially of an assay, a microarray, a macroarray, ELISA, a slide, and a filter containing specific biomarkers of HNSCC.
 3. The diagnostic device of claim 1, wherein said detection means is an immunoassay.
 4. The diagnostic device of claim 1, wherein said markers in said panel are chosen from the group consisting of markers listed in Table 5.
 5. A diagnostic device for use in staging head and neck squamous cell carcinoma (HNSCC) in a patient comprising detection means for detecting a presence of at least one marker indicative of stages of HNSCC, said detection means including a panel of markers for HNSCC.
 6. The diagnostic device of claim 5, wherein said detection means is selected from the group consisting essentially of an assay, a microarray, a macroarray, ELISA, a slide, and a filter containing specific biomarkers of HNSCC.
 7. The diagnostic device of claim 5, wherein said detection means is an immunoassay.
 8. The diagnostic device of claim 5, wherein said markers in said panel are chosen from the group consisting of markers listed in Table 5.
 9. Markers for head and neck squamous cell carcinoma selected from the markers listed in Table 5.
 10. A method of diagnosing head and neck squamous cell carcinoma (HNSCC), including the steps of: detecting markers in the serum of a patient indicative of the presence of HNSCC; and diagnosing the patient with HNSCC.
 11. The method of claim 10, wherein said detecting step further includes detecting markers chosen from the group consisting of markers listed in Table 5.
 12. A method of staging head and neck squamous cell carcinoma (HNSCC), including the steps of: detecting markers in the serum of a patient indicative of a stage of HNSCC with the diagnostic device of claim 1; and determining the stage of HNSCC.
 13. The method of claim 12, wherein said detecting step further includes detecting markers chosen from the group consisting of markers listed in Table 5.
 14. A method of personalized immunotherapy, including the steps of: detecting markers in the serum of a patient with the diagnostic device of claim 1; analyzing reactivity of markers in the serum to markers in the panel of the diagnostic device; identifying markers in the serum with the highest reactivity; and using the markers identified as immunotherapeutic agents personalized to the immunoprofile of the patient.
 15. The method of claim 14, wherein said using step is further defined as using the markers identified as immunotherapeutic agents for head and neck squamous cell carcinoma personalized to the immunoprofile of the patient.
 16. The method of claim 15, wherein the markers in the panel are chosen from the group consisting of markers listed in Table 5.
 17. A method of making a personalized anti-cancer vaccine, including the steps of: detecting markers in the serum of a patient with the diagnostic device of claim 1; analyzing reactivity of markers in the serum to markers in the panel of the diagnostic device; identifying markers in the serum with the highest reactivity; and formulating an anti-cancer vaccine using the identified markers.
 18. The method of claim 17, wherein said formulating step is further defined as formulating an anti-head and neck squamous cell carcinoma vaccine using the identified markers.
 19. The method of claim 18, wherein the markers in the panel are chosen from the group consisting of markers listed in Table 5.
 20. An anti-cancer vaccine made according to the method of claim 17.
 21. An anti-head and neck squamous cell carcinoma vaccine made according to the method of claim 18.
 22. A method of predicting a clinical outcome in a head and neck squamous cell carcinoma (HNSCC) patient, including the steps of: analyzing a pattern of reactivity of a patient's serum with a panel of HNSCC markers; and predicting a clinical outcome.
 23. The method of claim 22, wherein the panel includes at least one marker chosen from the group consisting of markers listed in Table 5.
 24. The method of claim 22, wherein the predicting step further includes predicting a clinical outcome chosen from the group consisting of response to a particular therapeutic intervention or chemotherapeutic drug, survival, and development of neck or distant metastasis.
 25. A method of making a panel of head and neck squamous cell carcinoma markers, including the steps of: creating HNSCC cDNA libraries; replicating the HNSCC cDNA libraries; performing differential biopanning; selecting clones to be arrayed on a protein microarray; immunoreacting the microarray against HNSCC patient serum; and selecting highly reactive clones for placement on a panel.
 26. A method of personalized targeted therapy, including the steps of: detecting markers that are overexpressed or altered due to mutation in the serum of a patient with the diagnostic device of claim 1; analyzing reactivity of markers in the serum to markers in the panel of the diagnostic device; identifying markers in the serum with the highest reactivity; and using the markers identified as therapeutic targets personalized to the patient.
 27. A method of staging cancer, including the steps of: detecting RNA or protein levels of markers that are overexpressed or altered due to mutation in the serum of a patient indicative of a stage of cancer with the diagnostic device of claim 1; and determining the stage of the cancer. 